Reactivity-dependent and interaction-dependent PCR

ABSTRACT

Methods, reagents, compositions, and kits for reactivity-dependent polymerase chain reaction (RD-PCR) and interaction-dependent polymerase chain reaction (ID-PCR) are provided herein. RD-PCR is a technique useful for determining whether a reactive moiety can form a covalent bond to a target reactive moiety, for example, in screening a library of candidate reactive moieties for reactivity with a target reactive moiety, and in identifying an enzyme substrate, for example, in protease substrate profiling. ID-PCR is a technique useful for determining whether a ligand can non-covalently bind to a target molecule, for example, in screening a library of candidate ligands for non-covalent interaction with a target molecule. RD-PCR and ID-PCR are also useful in detecting the presence of an analyte or an environmental condition.

RELATED APPLICATIONS

This application is a division of and claims priority under 35 U.S.C. §120 to U.S. application Ser. No. 13/505,872, filed Jul. 16, 2012, which is a national stage filing under 35 U.S.C. §371 of international PCT application, PCT/US2010/002732, filed Oct. 13, 2010, which claims priority under 35 U.S.C. §119(e) to U.S. provisional patent application, U.S. Ser. No. 61/257,983, filed Nov. 4, 2009, each of which is incorporated herein by reference.

GOVERNMENT SUPPORT

This invention was made with Government support under grant R01GM065865 awarded by the National Institute of General Medical Sciences and grant GM065865 awarded by the National Institutes of Health. The Government has certain rights in this invention.

BACKGROUND OF THE INVENTION

Current approaches for identifying functional nucleic acids and small molecules from libraries of such molecules are generally indirect. They require the synthesis of tagged substrates and typically involve multiple manipulations. Thus, current approaches leave much to be desired for the efficient screening of libraries of rapidly increasing complexity.

In vitro selection is a key component of efforts to discover functional nucleic acids and small molecules from libraries of DNA, RNA, and small molecules.(1) When the desired activity is binding affinity, as is the case for aptamer evolution(2) or for the discovery of DNA-linked small molecules that bind to a particular target,(3) a direct selection is possible; the library is typically incubated with immobilized target molecules, and bound library members are washed and eluted before being subjected to PCR amplification (FIG. 1a).

In vitro selections have also been developed to evolve RNA and DNA catalysts(4) and, more recently, to discover new reactions from DNA-encoded libraries of potential substrates.(5) In these selections, library members may undergo bond formation or bond cleavage. Selections for reactivity are significantly more complicated than selections for binding affinity. Typically, libraries are incubated with biotinylated substrates or potential reaction partners. Bond formation results in the attachment of biotin to a library member, which in turn enables its capture by immobilized avidin (FIG. 1b).(6) For bond cleavage, an inverse approach is commonly used in which immobilized, biotinylated library members are liberated upon bond scission.(7) While effective, such selections for chemical reactivity are indirect, require the synthesis of biotin-linked substrates, and involve multiple solution-phase and/or solid-phase manipulations. Therefore, better approaches to selection for chemical reactivity are needed to more efficiently screen complex libraries of chemical compounds for the discovery of new chemical reactions and interactions. References (1)-(7) are identified in Example 1.

SUMMARY OF THE INVENTION

In vitro selection of reaction or binding partners is an important way of discovering molecules with desired properties from libraries of candidate molecules, for example, from DNA, RNA, protein, peptide, and small molecule libraries. Such selections have been widely used to evolve RNA and DNA catalysts and, more recently, to discover new reactions from DNA-encoded libraries of potential substrates. Based on the observation that the melting temperature (T_(m)) of a double-stranded nucleic acid is substantially higher when hybridization occurs intramolecularly as opposed to intermolecularly, the present invention provides a system for reactivity-dependent or interaction-dependent polymerase chain reaction, a new in vitro selection technology that more directly links bond formation, bond cleavage, or a molecular interaction with the amplification of a desired sequence. This new system obviates the need for solid-phase capture, washing, and elution steps. This technology can be used to select for bond formation in the context of reaction discovery. It can also be used to identify cleavage sites in the context of protease or nuclease activity profiling. For example, reactivity-dependent PCR (RD-PCR) can be used for the identification of protease substrate amino acid sequences and for the identification of nuclease substrate nucleotide sequences. RD-PCR can also be useful in the evolution of ribozymes and DNAzymes. Further, this technology can be useful in identifying DNA binding site preferences of transcription factors and other DNA-binding proteins, for example, zinc-finger endonucleases. ID-PCR can be used in the identification of ligands that bind therapeutically relevant targets. Accordingly, ID-PCR can be useful in identifying agonists and antagonists of therapeutic targets.

Some aspects of this invention are based on the melting temperature (T_(m)) difference between duplex DNA formed intramolecularly versus intermolecularly. This difference can be exploited to couple covalent bond formation or a non-covalent association with PCR amplification. This invention stems from the discovery that a nucleic acid template can be efficiently amplified in a polymerase chain reaction (PCR) using a primer that is conjugated to the template, for example, a primer that is covalently attached to the template or a primer that is non-covalently associated with the template, under conditions (e.g., temperature, salt concentration) that do not allow annealing and efficient extension of a primer that is not conjugated to or associated (covalently or non-covalently) with the template.
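The T_(m) difference described above can be illustrated with a textbook two-state model. The following is a minimal sketch and not the method of the invention: it assumes the standard expressions T_(m) = ΔH/(ΔS + R ln(C/4)) for two separate, non-self-complementary strands at total strand concentration C, and T_(m) = ΔH/ΔS for the same stem formed within one molecule. The ΔH and ΔS values are hypothetical, and the entropic cost of closing the linker loop is ignored.

```python
import math

R = 1.987  # gas constant in cal/(mol*K)

def tm_intermolecular(dh_cal, ds_cal, total_strand_conc_molar):
    """Two-state Tm (kelvin) for a duplex formed by two separate,
    non-self-complementary strands present at equal concentration."""
    return dh_cal / (ds_cal + R * math.log(total_strand_conc_molar / 4.0))

def tm_intramolecular(dh_cal, ds_cal):
    """Two-state Tm (kelvin) for the same stem formed within one molecule.
    The concentration term drops out (loop penalty ignored, an assumption)."""
    return dh_cal / ds_cal

# Hypothetical stem thermodynamics, for illustration only (not from the source).
DH = -50_000.0  # cal/mol
DS = -140.0     # cal/(mol*K)

# Assumed mixture: 10 nM template plus 10 nM primer, i.e. 20 nM total strands.
inter_c = tm_intermolecular(DH, DS, 20e-9) - 273.15
intra_c = tm_intramolecular(DH, DS) - 273.15
print(f"intermolecular Tm ~ {inter_c:.1f} C, intramolecular Tm ~ {intra_c:.1f} C")
```

Under these assumptions the intermolecular T_(m) falls as the strands are diluted, while the intramolecular T_(m) does not depend on concentration. This is consistent with a primer annealing efficiently only when it is tethered to the template.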

Some aspects of this invention relate to exploiting this discovery to determine whether two reactive moieties, a candidate reactive moiety and a target reactive moiety, can form a covalent bond under certain test conditions (see FIG. 2). Reactivity-dependent PCR (RD-PCR) features efficient amplification of a PCR product only after a covalent bond is formed between a candidate reactive moiety coupled to a template and a target reactive moiety coupled to a primer. Reactive moieties may include, but are not limited to, such functional groups as amines, thiols, azides, hydroxyls, alkenes, alkynes, alkyl halides, dienes, acyl halides, esters, amides, etc. Some aspects of this invention relate to the application of RD-PCR to in vitro selection and chemical library screening. For example, RD-PCR may be used in the discovery and development of methods combining organic or bond-forming chemistry with PCR methodology. In certain embodiments, this invention relates to methods including coupling a candidate reactive moiety of a primer to a nucleic acid template including a sequence tag identifying its respective candidate reactive moiety. Some aspects of this invention relate to methods of contacting such nucleic acid templates with a target reactive moiety coupled to a primer and a subsequent amplification of only those nucleic acid templates coupled to a primer through a covalent bond formed between the two reactive moieties (see FIG. 2b).

In one aspect, this invention provides a method for determining whether a candidate reactive moiety can form a covalent bond with a target reactive moiety. In another aspect, the invention provides a method for screening a library of candidate reactive moieties for their ability to form a covalent bond with a target reactive moiety. In one aspect, the invention provides a method for determining whether a candidate ligand can bind to a target compound (e.g., a small molecule) or biomolecule (e.g., a peptide or nucleic acid), biomolecule derivative, or fragment thereof. In another aspect, the invention provides a method for screening a library of candidate ligands for their ability to bind a target compound or biomolecule, biomolecule derivative, or fragment thereof.

In one aspect, the present invention provides a method for screening a library of candidate structures, for example, polypeptide sequences or nucleotide sequences, to identify a structure, for example, an amino acid sequence or a nucleotide sequence, that is a substrate of an enzyme of interest. Such a method is, for example, particularly useful in determining the target sequence of a protease or a nuclease. Many proteases or nucleases cleave at a specific site within a target peptide or nucleotide sequence, but often a protease or nuclease recognizes a number of more or less similar target sequences. In certain embodiments, this invention provides methods to identify a protease-substrate peptide or a nuclease-substrate nucleotide sequence in a library of candidate substrates. In some embodiments, a library of nucleic acid templates coupled to candidate substrate peptides is provided and contacted with a protease of interest and a primer coupled to a reactive moiety. Cleavage of a candidate substrate peptide by the protease generates a reactive moiety (e.g., an amino group) that can subsequently form a covalent bond with a primer, while uncleaved candidate substrate peptides cannot form such a bond. PCR is then used to selectively amplify those nucleic acid templates coupled to candidate substrate peptides that have been cleaved and, thus, include a target sequence for the respective protease. In some embodiments, the sequence of the substrate peptides is determined by identifying the sequence tag of the amplified nucleic acid templates. Similarly, a library of candidate nucleic acid sequences may be screened in order to identify a target nucleic acid sequence for a given nuclease. In such embodiments, a library of nucleic acid templates coupled to a candidate nucleic acid substrate is provided, contacted with a nuclease of interest, and target nucleotide sequences are identified by RD-PCR in analogy to the identification of protease substrate sequences described herein. A consensus target sequence and enzyme preference for variations of such a consensus sequence can be determined if multiple target sequences are discovered. Such protease or nuclease profiling information can be useful, for example, for predicting targets of a specific protease. Accordingly, in one embodiment, the invention provides a method for protease activity profiling, involving identifying a plurality of substrate polypeptides of a protease of interest. Further, other aspects of this invention provide a method for determining a consensus binding sequence of a protease of interest. In one embodiment, the invention provides a method for nuclease activity profiling, involving identifying a plurality of substrate polynucleotide sequences of a nuclease of interest. Further, other aspects of this invention provide a method for determining a consensus binding sequence of a nuclease of interest.

In some embodiments, RD-PCR may be used to identify a functional nucleic acid (e.g., a ribozyme or DNAzyme) that cleaves a given target or catalyzes the formation of a specific reactive moiety or covalent bond. In such embodiments, a library of templates may be provided, which include a candidate functional nucleic acid sequence (e.g., a ribozyme or DNAzyme), a first primer hybridization site, a PCR primer hybridization site, and, optionally, a reactive moiety, and a sequence tag. In some embodiments, a template including a candidate functional nucleic acid coupled to a specific substrate is contacted with a primer coupled to a reactive moiety that can form a covalent attachment to a substrate only after the substrate has been modified by the functional nucleic acid, for example, by hydrolysis, acylation, phosphorylation, dephosphorylation, ligation, or other enzymatic reaction. In some embodiments, the candidate functional nucleic acid is a cis-acting nucleic acid. In some embodiments, a library of templates including candidate functional nucleic acids is screened to identify a functional nucleic acid able to catalyze a specific reaction, for example, cleavage of a nucleic acid target sequence. If a given template includes a functional nucleic acid that can perform the desired function, e.g., cleave a specific target nucleotide sequence or modify a reactive moiety in a way that a covalent bond between the template and the first primer can be formed, then the respective template sequence can be amplified in an RD-PCR reaction. In certain embodiments, a functional nucleic acid is identified by the sequence tag associated with the template. In certain embodiments, a sequence tag may be dispensable if the sequence of the functional nucleic acid can directly be determined from the respective RD-PCR product.

In other aspects of the invention, methods are provided that depend on non-covalent binding or association of a template and a first primer for PCR amplification instead of covalent bond formation between the two. This technology, also referred to as interaction-dependent PCR (ID-PCR), is useful to determine whether a candidate molecule, for example, a peptide or small molecule, can bind to a target molecule, for example, a protein of interest. In certain embodiments, a library of candidate molecules is screened for molecules that can bind non-covalently to a target molecule. In some embodiments, methods are provided to screen a library of candidate ligands against a library of candidate binding molecules, for example, a library of small molecules is screened against a library of polypeptides, to identify pairs of binding partners. In some embodiments, these pairs of binding partners are relevant to a specific biological pathway, and the identified ligands can be used as leads for the development of drugs targeting that biological pathway.

Some aspects of the invention relate to the use of RD-PCR or ID-PCR as an environmental sensor or a chemical sensor. In some embodiments, an RD-PCR reaction strategy or an ID-PCR binding strategy is provided in which the formation of a covalent bond or a non-covalent association depends on the presence or absence of a particular agent or analyte, for example, an oxidant, an enzyme, or an ion. Because of the high sensitivity of PCR, RD-PCR and ID-PCR strategies are particularly useful for the detection of low-abundance analytes.

Some aspects of the invention relate to reagents and kits useful to perform RD-PCR and/or ID-PCR. Reagents useful for performing RD-PCR and/or ID-PCR include, for example, reactive moieties, ligands or target molecules, reagents necessary to couple such moieties, ligands, or target molecules to a nucleic acid, nucleic acids, nucleic acids coupled to target reactive moieties, nucleic acids coupled to candidate reactive moieties, nucleic acids including a primer hybridization site and/or a sequence complementary to a primer hybridization site, and/or PCR reagents (e.g., buffer, polymerase, and/or nucleotides). Such reagents or a subset thereof may be conveniently packaged in a kit for use by a researcher. The kit may include instructions for using the components in RD-PCR and/or ID-PCR.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Traditional approaches to in vitro selection.

FIG. 2. Principles underlying reactivity-dependent PCR (RD-PCR). Conditions in (a): 10 nM DNA, 2 mM Mg²⁺, 100 mM NaCl.

FIG. 3. Comparison of PCR efficiency of intramolecularly primed versus intermolecularly primed DNA templates. PCR conditions for PAGE samples: 19 fmol of 8 or 19 fmol of 4a+5 in 30 μL, 25 cycles.

FIG. 4. Non-natural hairpin linkers support self-priming PCR. R=—(CH₂)₆OH.

FIG. 5. Selectivity of RD-PCR in a library-format mock selection. PCR conditions: 19 fmol of 12 and 13 in 60 μL, 25-35 cycles.

FIG. 6. RD-PCR-based DNA-encoded reaction discovery selection. PCR conditions for PAGE samples: 1 fmol of 9 in 20 μL, 23 cycles.

FIG. 7. RD-PCR-based protease-mediated peptide cleavage selection. PCR conditions for PAGE samples: 19 fmol of DNA in 30 μL, 23 cycles (lanes 1-5) or 25 cycles (lanes 6-9). D=DMT-MM; E=EDC+sNHS.

FIG. 8. Optimization of stem length.

FIG. 9. Cycle Threshold (CT) is linearly correlated to initial log [9a].

FIG. 10. Synthesis of 9b, a substrate containing an azide and an aldehyde reactive moiety.

FIG. 11. Disulfide-linked hairpin supports self-priming PCR. PCR conditions: 500 pM 9b or 500 pM 4b and 500 pM 10b, 20 cycles.

FIG. 12. An example of amide formation-dependent PCR.

FIG. 13. Fluorescence quantitation of the DNA-templated acylation reaction.

FIG. 14. Library-format experiment with PvuII.

FIG. 15. RD-PCR-based DNA-encoded reaction discovery selection validation using a copper-catalyzed Huisgen cycloaddition reaction. PCR conditions: 5 pM template, 24 cycles.

FIG. 16. Principles underlying interaction-dependent PCR (ID-PCR). Non-covalent binding between a candidate ligand and a target is displayed as an example of ID-PCR.

FIG. 17. ID-PCR-based reaction transducing non-covalent binding into a PCR template.

FIG. 18. ID-PCR analyte binding strategy: single-point analyte binding.

FIG. 19. ID-PCR analyte binding strategy: sandwich binding.

FIG. 20. (a) Overview of ID-PCR. (b) ID-PCR with SA (streptavidin) as the target (1a-SA) and biotin (2a-biotin, K_(d)=40 pM) or desthiobiotin (2b-desthiobiotin, K_(d)=2 nM) as ligands was analyzed by qPCR and PAGE (21 cycles of PCR). ID-PCR reports the interaction of (c) trypsin and antipain (K_(i)=100 nM); (d) CA and carboxy benzene sulfonamide (CBS, K_(i)=3.2 μM) or Gly-Leu-CBS (GLCBS, K_(i)=9 nM); and (e) a DNA aptamer and daunomycin (Dn, K_(d)=272 nM) or doxorubicin (Dx). PAGE gels in (c), (d), and (e) show DNA after 20, 24, and 23 cycles of PCR, respectively.

FIG. 21. (a) ID-PCR with a single target in the presence of a mock ligand library. (b) Mixtures of 2i-biotin and excess 2k-GLCBS were subjected to ID-PCR with 1a-SA or 1. (c) Mixtures of 2k-GLCBS and excess 2i-biotin were subjected to ID-PCR with 1c-CA or 1. (d) Mixtures of 2n-Dn and excess 2l were subjected to ID-PCR against 1f-aptamer or 1h. (e) Mixtures of 1g-aptamer and excess unstructured DNA (1h) were subjected to ID-PCR with 2g-Dn or 2f. The DNA in (b), (c), (d), and (e) was digested with EcoRI, HindIII, NsiI, or NsiI, respectively.

FIG. 22. (a) A model library of DNA-encoded ligands mixed with a model library of DNA-encoded targets allows multiplexed detection of binding pairs. (b) ID-PCR was used to perform a model selection on an equimolar 261-member DNA-ligand library and an equimolar 259-member DNA-target library containing five known protein-ligand pairs out of 67,599 possible combinations. For each protein target, the most highly enriched sequences (A-E) relative to a control lacking proteins corresponded to the known protein-ligand pairs, labeled A-E in the plot. A: biotin+SA; B: desthiobiotin+SA; C: GLCBS+CA; D: CBS+CA; E: trypsin+antipain.

FIG. 23. Optimization of complementary region length.

FIG. 24. The effect of ligand-oligonucleotide linker length on ID-PCR efficiency.

FIG. 25. Affinities of ligands and ligand-DNA conjugates for protein targets.

FIG. 26. Detection of multivalent analytes by ID-PCR.

FIG. 27. PAGE characterization of DNA-target conjugates.
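The library arithmetic stated for FIG. 22, and the idea of ranking ligand-target pairs by enrichment relative to a control lacking proteins, can be sketched as follows. This is an illustrative sketch only: the read counts are invented, and the pseudocount-based fold-enrichment formula and the function names are assumptions, not the analysis actually used for the figure.

```python
def pair_count(n_ligands, n_targets):
    """Number of possible ligand-target combinations between two libraries."""
    return n_ligands * n_targets

def fold_enrichment(selected, control, pseudocount=1.0):
    """Ratio of each pair's read fraction in the selection to its read
    fraction in the control; the pseudocount avoids division by zero."""
    pairs = set(selected) | set(control)
    sel_total = sum(selected.values()) + pseudocount * len(pairs)
    ctl_total = sum(control.values()) + pseudocount * len(pairs)
    return {
        pair: ((selected.get(pair, 0) + pseudocount) / sel_total)
        / ((control.get(pair, 0) + pseudocount) / ctl_total)
        for pair in pairs
    }

# Library sizes are taken from the FIG. 22 description.
print(pair_count(261, 259))

# Hypothetical read counts keyed by (ligand, target); not real data.
selected_reads = {("biotin", "SA"): 900, ("CBS", "SA"): 50, ("biotin", "CA"): 50}
control_reads = {("biotin", "SA"): 100, ("CBS", "SA"): 100, ("biotin", "CA"): 100}

scores = fold_enrichment(selected_reads, control_reads)
best_pair = max(scores, key=scores.get)
print(best_pair, round(scores[best_pair], 2))
```

With these made-up counts the biotin+SA pair ranks first, mirroring how the known pairs A-E are described as the most highly enriched sequences.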

DEFINITIONS

As used herein and in the claims, the singular forms “a,” “an,” and “the” include the plural reference unless the context clearly indicates otherwise. Thus, for example, a reference to “an agent” includes a plurality of such agents.

As used herein, the term “amplicon” refers to a nucleic acid molecule that is amplified in a polymerase chain reaction. In RD-PCR and ID-PCR reactions, an amplicon is typically the nucleic acid template or a portion thereof. In library screening RD-PCR or ID-PCR reactions, the amplicon typically includes the sequence tag of the nucleic acid template.

As used herein, the term “analyte,” interchangeably used with the term “environmental parameter,” refers to a component of a sample (e.g., an environmental or biological sample), the presence or absence of which can be determined by an RD-PCR or ID-PCR assay. In some RD-PCR embodiments, the analyte is a cofactor to a reaction, for example, a molecule or condition without which a covalent bond between two reactive moieties is not efficiently formed. In some ID-PCR embodiments, the analyte is a cofactor to an interaction, for example, a molecule or condition without which a non-covalent interaction between a ligand and a binding molecule is not efficiently formed. Non-limiting examples of analytes are a cofactor to an enzymatic reaction, an enzyme, a binding molecule, a catalyst of a chemical reaction, and oxidizing conditions.

As used herein, the term “catalyst” refers to a chemical substance able to increase or decrease the rate of a chemical reaction. A catalyst may be heterogeneous (existing in a different phase than the substrate) or homogeneous (existing in the same phase as the substrate). For example, if the reaction is a covalent bond-forming reaction between two reactive moieties that takes place in an aqueous solution, a catalyst may be in solution, a solid, or in colloidal form. A catalyst may be an inorganic catalyst, for example, an ion or a metal surface; an organometallic catalyst; or an organic catalyst. A catalyst may include, for example, an ion, for example, a Cu, Mg, Zn, Pb, Pd, Pt, Ca, or Fe ion; a protein, for example, an enzyme; a nucleic acid, for example, a ribozyme or DNAzyme; or a small molecule. The choice of a suitable catalyst will depend, of course, on the specific chemical reaction and the reactants. Catalysts for many different chemical reactions are well known in the art.

As used herein, the term “contacting” refers to bringing a first molecule, for example, a nucleic acid molecule (e.g., a nucleic acid template including a reactive moiety), and a second molecule, for example, a second nucleic acid molecule (e.g., a primer), optionally including a second reactive moiety, together in a manner that the molecules can bind, hybridize, and/or react. Contacting may be accomplished in a cell-free system, for example, by adding a second molecule to a solution including a first molecule under suitable conditions. Conditions suitable for nucleic acid hybridization and various chemical reactions are well known in the art.

As used herein, the term “covalent bond” refers to a form of chemical bonding that is characterized by the sharing of one or more pairs of electrons between atoms. Reactions forming a covalent bond between two reactive moieties are well known in the art, and include, for example, acylation reactions, addition reactions, nucleophilic substitution reactions, cycloaddition reactions, carbonyl chemistry reactions, “non aldol”-type carbonyl chemistry reactions, carbon-carbon bond forming reactions, and addition reactions to carbon-carbon double or triple bonds. A covalent bond formed between two reactive moieties may, for example, be an amide bond, an acyl bond, a disulfide bond, an alkyl bond, an ether bond, or an ester bond. A covalent bond formed between two reactive moieties may be, for example, a carbon-carbon bond, a carbon-oxygen bond, a carbon-nitrogen bond, a carbon-sulfur bond, a sulfur-sulfur bond, a carbon-phosphorus bond, a phosphorus-oxygen bond, or a phosphorus-nitrogen bond.

As used herein, the term “enzyme” refers to a molecule, for example, a peptide, a protein, or a nucleic acid (for example, a ribozyme or DNAzyme) that catalyzes a chemical reaction. An enzyme may be a biomolecule (a molecule made by a living organism), a derivative of a biomolecule (e.g., a mutated biomolecule, a fragment of a biomolecule, and/or a fusion product of a biomolecule, or fragment thereof, with a second molecule), or an artificially made molecule (e.g., a synthetic protein or nucleic acid). An enzyme may be an oxidoreductase, transferase, polymerase, hydrolase, lyase, synthase, isomerase, or ligase. Accordingly, a protease and a nuclease are non-limiting examples of enzymes. In certain embodiments, the enzyme is a protein. In certain embodiments, the enzyme is a nucleic acid. In certain embodiments, the enzyme is RNA. In certain embodiments, the enzyme is DNA.

As used herein, the term “enzyme substrate” refers to a molecule upon which an enzyme acts. An enzyme substrate is bound by an enzyme and transformed into one or more products in a chemical reaction catalyzed by the enzyme. The reaction product or products are usually released from the enzyme. For example, a protease catalyzes the hydrolysis of an amide bond in a protease substrate peptide or protein. The substrate peptide of a protease is generally bound specifically, meaning that only a peptide of a certain amino acid sequence or with a sequence similar to a consensus sequence is bound by the protease and cleaved into two or more fragments in a hydrolysis reaction.

As used herein, the term “functional nucleic acid” refers to a nucleic acid with enzymatic activity, binding activity, or biological activity. Ribozymes and DNAzymes are non-limiting examples of functional nucleic acids.

As used herein, the term “interaction-dependent polymerase chain reaction” (ID-PCR) refers to a PCR assay in which amplification of a nucleic acid template depends upon the nucleic acid template having a non-covalent association with a PCR primer. The non-covalent association is preferably a high-affinity association, for example, characterized by a K_(D) of 10⁻⁶ or less. The non-covalent association may be formed between a ligand attached to the nucleic acid template and a binding molecule attached to the primer.

The term “ligand,” as used herein, refers to a binding molecule that binds non-covalently to a second binding molecule with high affinity. In some embodiments, a high-affinity bond is characterized by a K_(D)<10⁻⁶, a K_(D)<10⁻⁷, a K_(D)<10⁻⁸, a K_(D)<10⁻⁹, a K_(D)<10⁻¹⁰, a K_(D)<10⁻¹¹, or a K_(D)<10⁻¹². In some embodiments, the ligand is a small molecule. In some embodiments, the ligand is a peptide or protein. In some embodiments, the ligand is a nucleic acid.
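By way of illustration only, the tiered K_(D) thresholds above can be applied to the K_(d) values quoted in the description of FIG. 20. The sketch below assumes the thresholds are expressed in molar units, which the text does not state, and the function name is hypothetical.

```python
# Thresholds from the definition above; molar units are an assumption.
KD_THRESHOLDS_M = [1e-6, 1e-7, 1e-8, 1e-9, 1e-10, 1e-11, 1e-12]

def strictest_threshold_met(kd_molar):
    """Return the smallest listed threshold that kd_molar is below,
    or None if the interaction misses even the loosest threshold."""
    met = [t for t in KD_THRESHOLDS_M if kd_molar < t]
    return min(met) if met else None

# K_d values quoted for FIG. 20, converted to molar.
examples = {
    "biotin/SA": 40e-12,
    "desthiobiotin/SA": 2e-9,
    "daunomycin/aptamer": 272e-9,
}
for name, kd in examples.items():
    print(name, strictest_threshold_met(kd))
```

Under the molar assumption, all three quoted interactions satisfy at least the loosest threshold, with biotin and streptavidin satisfying the tightest of the three.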

As used herein, the term “library of nucleic acid templates” refers to a plurality of nucleic acid templates. In some embodiments, each nucleic acid template of a library of nucleic acid templates is bound to one of various reactive moieties. In some embodiments, a library of nucleic acid templates includes nucleic acid templates bound to the same type of reactive moiety, for example, a library of polypeptide-associated nucleic acid templates may include only nucleic acid templates bound to polypeptides. In some embodiments, each nucleic acid template is bound to a specific reactive moiety, wherein the specific reactive moiety a nucleic acid template is bound to can be identified by the nucleic acid template's sequence tag. For example, a specific sequence tag may identify a peptide-associated nucleic acid template to be bound to the peptide Ala-Pro-Gly-Phe-Ala (SEQ ID NO: 1), whereas a different nucleic acid template of the same library with a different sequence tag is bound to a different peptide.
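The role of the sequence tag can be illustrated with a small lookup. This is a hedged sketch: the tag sequences, the tag position within the amplicon, the second peptide name, and the function name are all hypothetical; only the peptide Ala-Pro-Gly-Phe-Ala comes from the text above.

```python
from collections import Counter

# Hypothetical tag-to-moiety table; the tag sequences are invented.
TAG_TO_PEPTIDE = {
    "ACGTAC": "Ala-Pro-Gly-Phe-Ala",     # peptide named in the text
    "TGCATG": "hypothetical-peptide-2",  # placeholder for a second member
}

def decode_amplicons(amplicons, tag_start, tag_length, tag_table):
    """Count how often each library member appears among amplified
    sequences by reading the tag at a fixed (assumed) position."""
    counts = Counter()
    for seq in amplicons:
        tag = seq[tag_start:tag_start + tag_length]
        moiety = tag_table.get(tag)
        if moiety is not None:  # unknown tags are skipped
            counts[moiety] += 1
    return counts

# Made-up amplicon reads with the tag at positions 2-7.
reads = ["GGACGTACTT", "GGACGTACTT", "GGTGCATGTT", "GGNNNNNNTT"]
print(decode_amplicons(reads, tag_start=2, tag_length=6, tag_table=TAG_TO_PEPTIDE))
```

In a real selection the tag layout would follow the template design; the fixed offset here is merely the simplest choice for a sketch.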

The term “melting temperature” (T_(m)) is an art-recognized term and refers to the temperature at which hybridization of two nucleotide strands is destabilized so that the two nucleotide strands separate (or dissociate). In PCR, the melting temperature is the temperature at which a primer hybridized to a template dissociates from the template.

As used herein, the term “non-covalent bond”, interchangeably used with the term “non-covalent interaction” herein, refers to a type of interaction between two molecules that does not involve the sharing of electrons between the molecules, but involves variations of electromagnetic, electrostatic, or hydrophobic interactions.

As used herein, the term “nucleic acid,” interchangeably used with the terms “nucleic acid template,” “nucleic acid molecule,” “polynucleotide,” and “oligonucleotide,” refers to a polymer of nucleotides. Typically, a polynucleotide comprises at least two nucleotides. DNAs and RNAs are polynucleotides. The polymer may include natural nucleosides (i.e., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine), nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, C5-propynylcytidine, C5-propynyluridine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine), chemically modified bases, biologically modified bases (e.g., methylated bases), intercalated bases, modified sugars (e.g., 2′-fluororibose, 2′-methoxyribose, 2′-aminoribose, ribose, 2′-deoxyribose, arabinose, and hexose), and/or modified phosphate groups (e.g., phosphorothioates and 5′-N phosphoramidite linkages). Enantiomers of natural or modified nucleosides may also be used. Nucleic acids also include nucleic acid-based therapeutic agents, for example, nucleic acid ligands, siRNA, short hairpin RNA, antisense oligonucleotides, ribozymes, aptamers, and SPIEGELMERS™, oligonucleotide ligands described in Wlotzka, et al., Proc. Natl. Acad. Sci. USA, 2002, 99(13):8898, the entire contents of which are incorporated herein by reference. A nucleic acid may further include a non-nucleic acid moiety or molecule, for example, a reactive moiety or a binding molecule, such as a ligand.

As used herein, the term “nucleic acid linker” refers to a nucleic acid molecule including a primer hybridization site. A nucleic acid linker may be single-stranded or double-stranded. A double-stranded nucleic acid linker may include a nucleic acid overhang compatible with a specific restriction site in a ligation reaction. Alternatively, a double-stranded nucleic acid linker may be blunt-ended. A nucleic acid linker may be ligated to a nucleic acid molecule in order to add a primer hybridization site or a restriction site.

The term “polymerase chain reaction” (PCR) is an art-recognized term and refers to a method of amplifying a nucleic acid molecule. PCR uses thermal cycling, consisting of cycles of repeated heating and cooling of a PCR sample including the nucleic acid molecule to be amplified. A typical PCR cycle includes a denaturation (or melting) step, an annealing step, and an elongation (or extension) step. A typical PCR includes between 12 and 40 cycles. A PCR may further include an initialization step, for example, if heat activation of a hot start polymerase is performed, a hold step, a final extension or hold step, and a final cooling step. PCR reagents include a buffer, for example, a buffer including Mg²⁺ ions, one or more primers, nucleotides, and a thermophilic polymerase, for example, Taq, Pfu, Pwo, Tfl, rTth, Tli, Tma, Bst, 9° N_(m), Vent, or Phusion polymerase. A PCR product is a nucleic acid generated as a result of a PCR. PCR protocols are well known in the art, for example, as described in Chapter 8 (“In vitro amplification of DNA by the polymerase chain reaction”) of Sambrook et al., Molecular Cloning: A Laboratory Manual, Volumes 1-3, Cold Spring Harbor Laboratory Press, 2001. Reagents and reagent kits for PCR are available from numerous commercial suppliers.
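The effect of cycle number on product yield can be sketched with the idealized exponential model N = N₀(1 + E)ⁿ, where E is the per-cycle efficiency. This is a simplification offered for illustration: it assumes a constant efficiency and ignores the plateau seen in real reactions, and the function name is hypothetical.

```python
def amplified_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after a number of PCR cycles.
    efficiency=1.0 means perfect doubling in every cycle (an assumption)."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles

# Fold amplification across the typical cycle range named in the definition.
for n in (12, 25, 40):
    print(n, f"{amplified_copies(1, n):.3g}")
```

Even a modest drop in per-cycle efficiency compounds strongly over 20 to 25 cycles, which is one way to picture why an untethered primer that anneals poorly yields little product.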

The term “quantitative PCR” (qPCR) refers to a method used to measure the quantity of a PCR product. If the quantity of a PCR product is measured in real time, the method is referred to as “quantitative, real-time PCR”.
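Because the cycle threshold (CT) is described as linearly correlated with the logarithm of the initial template concentration (FIG. 9), a standard curve can be fitted and inverted to estimate an unknown input. The sketch below uses ordinary least squares on invented standards; the data, units, and function names are assumptions for illustration.

```python
import math

def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of CT = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def concentration_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting concentration."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical dilution series (molar) and measured CT values.
STANDARDS_M = [1e-9, 1e-10, 1e-11, 1e-12]
STANDARD_CT = [10.0, 13.3, 16.6, 19.9]

slope, intercept = fit_standard_curve(STANDARDS_M, STANDARD_CT)
efficiency = 10 ** (-1.0 / slope) - 1.0  # per-cycle efficiency implied by the slope
unknown = concentration_from_ct(14.95, slope, intercept)
print(f"slope={slope:.2f}, efficiency={efficiency:.2f}, unknown={unknown:.2e} M")
```

A slope near -3.3 cycles per tenfold dilution corresponds to roughly one doubling per cycle.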

A “polypeptide”, “peptide”, or “protein” comprises a string of at least three amino acids linked together by peptide bonds. The terms “polypeptide”, “peptide”, and “protein” may be used interchangeably. Peptide may refer to an individual peptide or a collection of peptides. Inventive peptides preferably contain only natural amino acids, although non-natural amino acids (i.e., compounds that do not occur in nature but that can be incorporated into a polypeptide chain) and/or amino acid analogs as are known in the art may alternatively be employed. Also, one or more of the amino acids in a peptide may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc. In one embodiment, the modifications of the peptide lead to a more stable peptide (e.g., greater half-life in vivo). These modifications may include cyclization of the peptide, the incorporation of D-amino acids, etc. None of the modifications should substantially interfere with the desired biological activity of the peptide.

As used herein, the term “primer” refers to a nucleic acid molecule that can hybridize to a primer hybridization site of a nucleic acid template via base pairing and that can be elongated by a polymerase, for example, Taq, Pfu, Pwo, Tfl, rTth, Tli, Tma, Bst, 9° N_(m), Vent, or Phusion polymerase during a PCR. A primer, accordingly, includes a free 3′-OH group or other group amenable to the addition of nucleotide monomers by a polymerase. In some embodiments, only a 3′ portion of the primer hybridizes to the primer hybridization site. In other embodiments, the whole primer hybridizes to the primer hybridization site. A primer includes a nucleotide sequence complementary to that of the primer hybridization site it hybridizes to. It should be noted that primer hybridization may tolerate nucleotide-nucleotide mismatches, and, therefore, “complementary” does not require complete complementarity, but only a degree of complementarity sufficient for hybridization. Typically, a primer includes between 18 and 35 nucleotides. However, a primer may be longer or shorter than that, for example, ranging in length from 5-100 nucleotides. In a PCR, a primer hybridizes with a primer hybridization site of a nucleic acid template during the annealing step, is elongated by nucleotide addition in the elongation step, and the hybridization of elongated primer and template is broken during the denaturing step. If a primer is covalently bound to the nucleic acid molecule including the primer hybridization site, the sequence hybridizing with the nucleic acid template may be as short as 5-10 nucleotides, for example, the hybridizing sequence of the primer may be 5, 6, 7, 8, 9, or 10 nucleotides long.

As used herein, the term “primer extension,” interchangeably used with the term “primer elongation”, refers to the extension of a primer that hybridizes to a nucleic acid template by the addition of nucleotides complementary to the nucleic acid sequence of the template. In a PCR, this primer extension is usually performed by a thermophilic polymerase, for example, Taq, Pfu, Pwo, Tfl, rTth, Tli, Tma, Bst, 9° N_(m), Vent, or Phusion polymerase.

As used herein, the term “primer hybridization site” refers to a nucleotide sequence that a primer can hybridize to. A primer hybridization site may be part of a nucleic acid template. The primer hybridization site may be 100% homologous to the primer sequence, or may be less than 100% homologous (e.g., 99.9%, 99%, 98%, 97%, 96%, 95%, 90%, 85% homologous). The length and sequence of a primer hybridization site are dependent on the specific application. Length and nucleotide sequence can impact PCR parameters such as annealing temperature and cycle length. Usually, a primer hybridization site is between 10 and 40 bases long. In some embodiments, a primer hybridization site may be shorter than that, depending on primer sequence and intended hybridization parameters. Methods to design primers for annealing and extension in view of hybridization and extension parameters and methods of adapting hybridization and extension conditions in view of specific primer length and/or sequence are well known in the art.
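The percentage figures above can be illustrated with a minimal sketch that counts how many positions of a primer pair with a hybridization site. The function names `reverse_complement` and `percent_match` are hypothetical, both sequences are assumed to be written 5′ to 3′ and to have equal length, and only Watson-Crick pairs are counted; this is not a predictor of whether hybridization actually occurs.

```python
# Illustrative only: names and the equal-length assumption are not from the source.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def percent_match(primer: str, site: str) -> float:
    """Percent of positions at which the primer pairs with the site.

    Both sequences are given 5'->3' and must have the same length.
    A perfectly matched primer equals the reverse complement of the site.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    expected = reverse_complement(site)
    matches = sum(p == e for p, e in zip(primer.upper(), expected))
    return 100.0 * matches / len(site)
```

A primer that is the exact reverse complement of a 20-nucleotide site scores 100%, and a single mismatch lowers the score to 95%.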

As used herein, the term “protease” refers to any enzyme that catalyzes a proteolysis reaction, for example, hydrolysis of a peptide bond. Protease substrates are typically polypeptides or proteins. A protease may specifically bind only a polypeptide or protein including a specific amino acid sequence, the binding motif, or, alternatively, a protease may bind a plurality of polypeptides or proteins including different binding motif amino acid sequences that are similar to a consensus sequence of the binding motif. A protease binding motif consensus sequence can be determined by methods known to those of skill in the art, once a plurality of binding motifs has been identified. A binding motif consensus sequence determination may also include a quantitative analysis of protease binding motif preferences among a plurality of binding motifs, for example, by measuring reaction rates for different binding motifs.

The term “protease substrate profile” refers to a list of identified substrate binding motifs of a specific protease. Such a list may include weighted binding information for each identified protease binding motif and/or a binding motif consensus sequence.

The term “protein,” used interchangeably with the term “polypeptide” herein, refers to a molecule including a polymer of amino acid residues linked together by peptide bonds. The term, as used herein, refers to proteins, polypeptides, and peptides of any size, structure, or function. Typically, a protein or polypeptide will be at least three amino acids long. A protein may refer to an individual protein or a collection of proteins. Inventive proteins preferably contain only natural amino acids, although non-natural amino acids (i.e., compounds that do not occur in nature but that can be incorporated into a polypeptide chain) and/or amino acid analogs as are known in the art may alternatively be employed. Also, one or more of the amino acids in an inventive protein may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a hydroxyl group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc. A protein may also be a single molecule or may be a multi-molecular complex. A protein may be just a fragment of a naturally occurring protein or peptide. A protein may be naturally occurring, recombinant, or synthetic, or any combination thereof.

As used herein, the term “reactive moiety” refers to a molecular entity or functional group able to form a covalent bond with another reactive moiety. Accordingly, a reactive moiety may include a reactive functional group, for example, an alkenyl, alkynyl, phenyl, benzyl, halo, hydroxyl, thiol, carbonyl, aldehyde, carbonate ester, carboxylate, carboxyl, ether, ester, carboxyamide, amine, ketimine, aldimine, imide, azido, diimide, cyanate, isocyanide, isocyanate, isothiocyanate, nitrile, sulfide, or disulfide group. A reactive moiety may be part of a compound. The term “compound” refers to any molecule that is to be tested, for example, for the ability of a reactive moiety of a compound to form a covalent bond with a second reactive moiety. A reactive moiety or a compound containing such a moiety can be randomly selected or rationally selected or designed.

As used herein, the term “reactivity-dependent polymerase chain reaction” (RD-PCR) refers to a PCR assay in which amplification of a nucleic acid template depends upon the nucleic acid template forming a covalent bond with a primer. The covalent bond may be formed between a reactive moiety of the nucleic acid template and a reactive moiety of the primer.

As used herein, the term “screening of a reactive moiety library” refers to an experiment to identify a reactive moiety with a specific characteristic in a reactive moiety library. Depending on the screening assay to be performed, as well as the format of the library, the experimental design of a library screen may vary. For example, a reactive moiety library screen may include contacting a plurality of candidate reactive moieties in a library in parallel, for example, in a single solution, with a screening reagent, for example, a target reactive moiety. As another example, a reactive moiety library screen may include contacting the plurality of reactive moieties of the library individually, for example, contacting a first candidate reactive moiety in a first solution, a second candidate reactive moiety in a second solution, and so forth, for example, in a microtiter-plate format, wherein a well is used to contact an individual candidate reactive moiety.

As used herein, the term “sequence tag” refers to a nucleotide sequence used to identify a candidate reactive moiety bound to a nucleic acid molecule or nucleic acid template. For example, in a library screening experiment, a nucleic acid template with a specific first sequence tag may include a specific candidate reactive moiety, while a nucleic acid template with a different sequence tag may include a different candidate reactive moiety. This way, a specific reactive moiety that forms a covalent bond with a target reactive moiety, for example, in a library screen, in which the nucleic acid templates of the library are contacted in parallel, can be identified by sequencing the sequence tag of an RD-PCR product obtained in the screen. Depending on the complexity of the library to be screened, the length of the sequence tag may vary. A single nucleotide in a DNA sequence tag including naturally occurring nucleotides can represent one out of four bases A, C, G and T. Thus, a sequence tag will allow for the identification of 4^(n) nucleic acid templates with n being the number of nucleotides of the sequence tag. For example, a sequence tag including 4 nucleotides could theoretically identify 256 different nucleic acid templates/reactive moieties, and a sequence tag including 10 nucleotides could theoretically identify 1,048,576 nucleic acid templates/reactive moieties. In practice, some theoretically possible sequence tags, for example, an all-G tag, may interfere with RD-PCR template amplification. Sequence tags with very high (>80%) or very low (<20%) GC content may cause problems in nucleic acid amplification during RD-PCR or ID-PCR, as may sequence tags showing self-complementarity or complementarity to any part of the nucleic acid template or other nucleic acids used in the RD-PCR or ID-PCR reaction. It is well known to those in the art how to design sequence tags and how to avoid high and low GC-content in designing nucleic acid components, for example, primers and templates, for PCR. As a result, the practical number of useful tags for a given sequence tag length is lower than the theoretical number of possible sequence tags. Sequence tag length may be determined, for example, by the number of reactive moieties to be tagged and/or the sequencing technology to be used in RD-PCR product sequencing. A library of nucleic acid molecules including sequence tags may include sequence tags of different length, thus increasing the number of usable sequence tags at any given maximum sequence tag length. The term “identifying a sequence tag” refers to determining the nucleotide sequence of a sequence tag.
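The tag-capacity arithmetic and the GC-content constraint described above can be sketched as follows. The function names are hypothetical, the 20% and 80% thresholds are taken from the text, and the self-complementarity check is deliberately simplified to rejecting tags that equal their own reverse complement; a practical design tool would also screen against the template and primer sequences.

```python
from itertools import product

# Illustrative only: function names and the simplified palindrome test
# are assumptions, not part of the described methods.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def theoretical_tag_count(n: int) -> int:
    """Number of distinct DNA tags of length n over A, C, G, T (4**n)."""
    return 4 ** n


def gc_fraction(tag: str) -> float:
    """Fraction of G and C bases in the tag."""
    return sum(base in "GC" for base in tag) / len(tag)


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def usable_tags(n: int, low: float = 0.2, high: float = 0.8) -> list:
    """Enumerate length-n tags with GC fraction in [low, high] that are
    not identical to their own reverse complement."""
    tags = []
    for bases in product("ACGT", repeat=n):
        tag = "".join(bases)
        if low <= gc_fraction(tag) <= high and tag != reverse_complement(tag):
            tags.append(tag)
    return tags
```

Calling `theoretical_tag_count(4)` reproduces the 256 figure given above, while `usable_tags(4)` returns a shorter list that excludes, for example, the all-G tag. Exhaustive enumeration is only practical for short tags.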

As used herein, the term “small molecule” is used to refer to molecules, whether naturally-occurring or artificially created (e.g., via chemical synthesis) that have a relatively low molecular weight. Typically, a small molecule is an organic compound (i.e., it contains carbon). The small molecule may contain multiple carbon-carbon bonds, stereocenters, and/or other functional groups (e.g., amines, hydroxyl, carbonyls, heterocyclic rings, etc.). In some embodiments, small molecules are monomeric and have a molecular weight of less than about 1500 g/mol. In certain embodiments, the molecular weight of the small molecule is less than about 1000 g/mol or less than about 500 g/mol. Preferred small molecules are biologically active in that they produce a biological effect, for example, a kinase inhibitor produces inhibition of a kinase, in animals, preferably mammals, more preferably humans. In certain embodiments, the small molecule is a drug. Preferably, though not necessarily, the drug is one that has already been deemed safe and effective for use in humans or animals by the appropriate governmental agency or regulatory body. For example, drugs approved for human use are listed by the FDA under 21 C.F.R. §§330.5, 331 through 361, and 440 through 460, incorporated herein by reference; drugs for veterinary use are listed by the FDA under 21 C.F.R. §§500 through 589, incorporated herein by reference. All listed drugs are considered acceptable for use in accordance with the present invention.

As used herein, the term “suitable conditions,” interchangeably used with the term “conditions suitable,” refers to conditions that are suitable for a specific reaction, interaction, or other event to take place. For example, conditions suitable to form a covalent bond between two reactive moieties may include both reactive moieties, a suitable medium allowing both reactive moieties to interact, for example, an aqueous solution, a reaction cofactor or catalyst, if necessary, a buffering agent, a certain temperature, pH, or osmolarity. The suitable conditions for any given reaction or interaction will, of course, depend on the specific reaction or interaction. Suitable conditions for the reactions or interactions described herein are well known to those in the relevant chemical and molecular biological arts. For example, suitable conditions for nucleic acid hybridization, primer extension, restriction digestion, and linker ligation are described herein and in Sambrook et al., Molecular Cloning: A Laboratory Manual, Volumes 1-3, Cold Spring Harbor Laboratory Press, 2001, incorporated herein by reference. Further, suitable conditions for various chemical reactions are described herein and, for example, in Smith and March, March's Advanced Organic Chemistry: Reactions, Mechanisms, and Structure, Wiley-Interscience, 6^(th) edition, 2007, incorporated herein by reference. Suitable conditions for covalent bond formation, enzymatic catalysis, and PCR are described herein and well known to those of skill in the art. In some embodiments, suitable conditions for hybridization of a nucleic acid template's primer hybridization site and a primer including a complementary nucleic acid sequence and/or primer extension are conditions allowing for efficient primer site hybridization and/or primer extension if the primer is covalently bound to the nucleic acid template, but not allowing for efficient primer site hybridization and/or primer extension if the primer is not covalently bound to the nucleic acid template.

The term “template,” as used herein, refers to a nucleic acid molecule including a primer hybridization site. A template may also include (e.g., be coupled to) a candidate reactive moiety and may include a tag, such as a sequence tag, identifying the attached candidate reactive moiety. A template is typically a DNA molecule.

DETAILED DESCRIPTION OF THE INVENTION

Recent advances in genome and proteome research have led to a dramatic increase in the number of targets of interest to the life sciences. The rapid identification of reaction partners and ligands to this expanding number of targets is a major scientific and technological challenge. To this end, a variety of target-oriented high-throughput screening methods have been developed. Two fundamental limitations to target-oriented screening methods are (i) the requirement that each target of interest must successively be assayed against libraries of potential ligands; and (ii) the general reliance on immobilized targets or ligands. The first constraint limits assay throughput significantly when researchers are interested in multiple targets or in ligand specificity. The second limitation adds immobilization, washing, and/or elution steps to the screening process and is a source of artifacts that arise, for example, from matrix binding, multivalent binding, or loss of native target structure. A solution-phase method to simultaneously reveal all reactive pairs or ligand-target binding pairs from a single solution containing libraries of candidate reactive agents or ligands and libraries of targets could in principle overcome both limitations and significantly increase the efficiency and effectiveness of target-oriented screening efforts. This invention provides such systems, for example, RD-PCR and ID-PCR.

Reactivity-dependent PCR (RD-PCR) and interaction-dependent PCR (ID-PCR), in general, exploit the discovery that a primer that is bound to a nucleic acid template can more efficiently hybridize to the template and initiate replication of the template than a non-covalently bound primer. That is, the melting temperature of a double-stranded nucleic acid is substantially higher when hybridization is intramolecular as opposed to intermolecular.
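The difference between intramolecular and intermolecular melting temperatures can be illustrated with the standard two-state thermodynamic model, in which a bimolecular duplex melts at a temperature that depends on strand concentration while a unimolecular one does not. The sketch below is an illustration under stated assumptions only: the enthalpy and entropy values are hypothetical placeholders rather than measured values from this disclosure, the same values are used for both cases, and the entropic cost of closing the linker loop is ignored.

```python
import math

R = 1.987  # gas constant in cal/(mol*K)


def tm_intermolecular(dh: float, ds: float, total_strand_conc: float) -> float:
    """Two-state melting temperature (K) of a duplex formed by two separate,
    non-self-complementary strands present at equal concentrations.

    dh is in cal/mol, ds in cal/(mol*K), total_strand_conc in mol/L.
    """
    return dh / (ds + R * math.log(total_strand_conc / 4.0))


def tm_intramolecular(dh: float, ds: float) -> float:
    """Two-state melting temperature (K) when both strands are part of one
    molecule; the result is independent of concentration."""
    return dh / ds


# Hypothetical values for a short primer/site duplex (assumptions).
DH = -50_000.0   # cal/mol
DS = -140.0      # cal/(mol*K)
CONC = 1e-9      # mol/L total strand concentration

tm_inter_c = tm_intermolecular(DH, DS, CONC) - 273.15
tm_intra_c = tm_intramolecular(DH, DS) - 273.15
```

With these placeholder numbers, `tm_intra_c` comes out far above `tm_inter_c`, which is the qualitative behavior the paragraph above describes: at dilute strand concentrations only the linked primer remains hybridized at typical annealing and extension temperatures.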

RD-PCR and ID-PCR are systems that are useful in identifying reaction partners or binding partners, respectively, from combined libraries of candidate reactive molecules or binding molecules. Formation of a covalent bond between a DNA template-linked reactive moiety and a primer-linked reactive moiety or of a non-covalent association between a DNA template-linked target and a primer-linked ligand results in the formation of a molecule in which the primer can hybridize with the target intramolecularly, which induces formation of an extendable duplex. If the DNA template comprises an identifiable tag, extension links codes identifying the reactive moiety or the target molecule into one selectively amplifiable DNA molecule.

Some RD-PCR methods described herein include (a) contacting a template (including a candidate reactive moiety) with a primer (including a target reactive moiety) under conditions suitable for candidate reactive moiety and target reactive moiety to form a covalent bond, (b) a primer hybridization and extension step, and (c) a subsequent PCR amplification step.

Some ID-PCR methods described herein include (a) contacting a template (including a candidate ligand) with a primer (including a target molecule) under conditions suitable for a candidate ligand to bind to the target molecule, (b) a primer hybridization and extension step, and (c) a subsequent PCR amplification step.

Some aspects of the invention relate to strategies for the identification of bond-forming and bond-cleaving reactivity in a single-phase format. Some aspects of the invention relate to strategies for the identification of ligand binding activity in a single-phase format. Some aspects of this invention provide methods that obviate the need for time- and/or work-intensive manipulations that conventional selection strategies are burdened with. Some aspects of this invention provide systems for reactivity-dependent and interaction-dependent polymerase chain reaction, reactivity and interaction identification strategies, that directly link bond formation, bond cleavage, or ligand interaction with the amplification of desired sequences.

In some embodiments, RD-PCR includes the steps of (i) providing a nucleic acid template including a first primer hybridization site, a sequence tag, a second primer hybridization site, and a candidate reactive moiety; (ii) contacting the nucleic acid template with a first primer including a sequence complementary to the first primer hybridization site, a third primer hybridization site, and a target reactive moiety; (iii) incubating the nucleic acid template contacted with the first primer under conditions suitable for the candidate reactive moiety to form a covalent bond with the target reactive moiety; (iv) incubating the nucleic acid template contacted with the first primer under conditions suitable for covalently bound first primer to hybridize with the first primer hybridization site of the nucleic acid template it is covalently bound to and for primer extension; (v) contacting the nucleic acid template contacted with the first primer with a PCR primer complementary to the second primer hybridization site and a PCR primer complementary to the third primer hybridization site or with a PCR primer complementary to the second and third primer hybridization site; and (vi) performing a polymerase chain reaction to amplify the template, or a portion of the template, including the sequence tag.

In some embodiments, both the candidate and the target reactive moiety are provided in a reactive form (see, for example, amide-formation dependent RD-PCR described in FIG. 12 and related text in the Example section below). In some embodiments, the candidate reactive moiety is provided in an inactive form (see, for example, FIG. 7 and related text in the Example section). Such embodiments, generally, include an additional step of exposing the reactive moiety to conditions and/or a reagent suitable to render the inactive reactive moiety active (i.e., deprotecting).

Design of the template and primer may follow the specific parameters exemplified herein or may follow parameters for PCR template and primer design well known to those of skill in the art. Template design depends on the specific RD-PCR application to be performed. For example, in some embodiments, only a single reactive moiety may be tested for its ability to form a covalent bond with a target reactive moiety. In these embodiments, a template may be designed that does not include a sequence tag identifying the reactive moiety.

In some embodiments, a template may include a second primer hybridization site in addition to the primer hybridization site complementary to a primer including a target reactive moiety (the first primer hybridization site). The second primer hybridization site may be complementary to a PCR primer (a PCR primer hybridization site). In some embodiments, the template may include only one primer hybridization site and a linker including the primer hybridization site complementary to a PCR primer may be ligated to the template at some point before PCR amplification.

In some embodiments, the primer including the target reactive moiety includes a PCR primer hybridization site. In some embodiments, the primer including the target reactive moiety does not include a PCR primer hybridization site.

In some embodiments, the end-product of the covalent binding of target moiety to candidate moiety, first primer hybridization and extension is a molecule including a sequence tag flanked by PCR primer hybridization sites (see, for example, FIG. 6, #14). In some embodiments, the PCR primer hybridization sites include the same sequence. In some embodiments, only one PCR primer that hybridizes to the PCR primer hybridization sites flanking the sequence tag of the template is used in the PCR step. In some embodiments, (a) a PCR primer that hybridizes to a PCR primer hybridization site flanking the sequence tag of the template, and (b) a PCR primer that hybridizes to a PCR primer hybridization site on the other side of the sequence tag of the template are used in the PCR step. In some embodiments, a PCR primer hybridizes to the first primer hybridization site (i.e., the site that the first primer hybridizes to). In such embodiments, a first primer may be provided that does not include a second primer hybridization site.

In some embodiments, a template is provided that includes a spacer sequence between the first primer hybridization site and the candidate reactive moiety. In some embodiments, a first primer is provided that includes a spacer sequence between the sequence complementary to the first primer hybridization site and the target reactive moiety. In some embodiments, both template and first primer include a spacer sequence between the respective reactive moiety and a sequence involved in primer hybridization. In some embodiments, the spacer sequence is designed to allow the first primer to hybridize with the first primer hybridization site of the template. A spacer sequence may be designed by methods well known to those of skill in the art. In general, a spacer sequence should not be complementary to any sequence of the template or the primer.

Primer extension is generally carried out by a nucleic acid polymerase, for example, a DNA or RNA polymerase. In some embodiments, a template and an extended primer are contacted with a PCR primer and exposed to conditions suitable to perform a polymerase chain reaction. Reagents and conditions for primer hybridization and extension are well known to those of skill in the art.

In some embodiments, a nucleic acid template including a candidate reactive moiety and a primer including a target reactive moiety are provided as separate (not covalently linked) molecules. In some embodiments, a nucleic acid template including a candidate reactive moiety and a primer including a target reactive moiety are provided as temporarily linked (e.g., covalently or non-covalently linked) molecules, wherein the linkage is not via a covalent bond between the candidate and the target reactive moiety and the template and primer are contacted or exposed to conditions suitable to undo the linkage before the primer extension step. For an example of an embodiment including temporarily linked template and primer, see FIG. 6.

In some embodiments, the ability of a single candidate reactive moiety to bind to a target reactive moiety is determined. In such embodiments, a template may be used that does not include a sequence tag. In such embodiments, the amplification of a template sequence during the PCR step, for example, at a cycle number at which non-covalently bound primer does not yield an amplified template sequence, may indicate that the candidate reactive moiety can covalently bind to the target reactive moiety under the respective conditions.

In some embodiments, the ability of a plurality of candidate reactive moieties to bind to a target reactive moiety is determined. In some such embodiments, a plurality of nucleic acid templates, each linked to a candidate reactive moiety, is contacted with a first primer. In some embodiments, the sequence tag of each template identifies the candidate reactive moiety it includes. In some embodiments, the sequence of the sequence tag of a PCR product is determined, for example, by sequencing methods well known to those of skill in the art, and a reactive moiety able to covalently bind to the target reactive moiety provided in the respective RD-PCR reaction is identified.
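Identification of reactive moieties from sequenced RD-PCR products, as described above, amounts to a lookup of each read's sequence tag in the library's tag assignment. The following is a minimal sketch assuming that the tag sits at a fixed, known offset in every read and that reads are error-free; the function name `decode_tags`, its parameters, and the example tag assignments are all hypothetical.

```python
from collections import Counter


def decode_tags(reads, tag_to_moiety, tag_start, tag_length):
    """Count how often each candidate reactive moiety is represented
    among sequenced PCR products.

    reads         -- iterable of read sequences (strings)
    tag_to_moiety -- dict mapping sequence tag -> moiety name
    tag_start     -- 0-based offset of the tag within each read (assumed fixed)
    tag_length    -- number of nucleotides in the tag
    """
    counts = Counter()
    for read in reads:
        tag = read[tag_start:tag_start + tag_length]
        moiety = tag_to_moiety.get(tag)
        if moiety is not None:  # ignore reads whose tag is not in the library
            counts[moiety] += 1
    return counts


# Hypothetical library assignment and reads, for illustration only.
example_library = {"ACGT": "amine", "TTAG": "thiol"}
example_reads = ["GGACGTTT", "GGACGTTT", "GGTTAGTT", "GGCCCCTT"]
example_counts = decode_tags(example_reads, example_library, 2, 4)
```

In this toy input the third read decodes to a different moiety than the first two, and the last read carries a tag that is absent from the library and is therefore skipped.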

In some embodiments, RD-PCR is used to identify a substrate of an enzyme. In some embodiments, a template is provided that includes a candidate enzyme substrate, for example, a polypeptide or nucleic acid sequence. In some embodiments, the reaction catalyzed by the enzyme to be investigated results in the formation of a reactive moiety as part of the candidate enzyme substrate. For example, hydrolysis of a covalent bond within the substrate may leave a reactive moiety (e.g., an amino group, a carboxyl group, or a hydroxyl group) at the cleavage site. For another example, phosphorylation or methylation of a candidate substrate may generate a reactive moiety as part of the respective candidate enzyme substrate.

In some embodiments, RD-PCR is used to identify a substrate of a protease (for example, see FIG. 7 and related text in the Example section). In some embodiments, a template is provided including a candidate amino acid sequence for a protease. In some embodiments, hydrolysis of a covalent bond within a candidate amino acid sequence by a protease results in the formation of a reactive moiety, for example, an amine group, able to form a covalent bond with a target reactive moiety, for example, a carboxyl group. In some embodiments, a plurality of templates including a candidate protease substrate peptide are provided and contacted with a protease under conditions suitable for the protease to bind and cleave its target peptides. Suitable conditions for in vitro protease binding and activity are well known to those of skill in the art for many different proteases. For example, suitable conditions for various proteases are described in Antalis et al., Proteases in Cancer: Methods and Protocols (Methods in molecular biology), Humana Press, 1^(st) edition, 2009. In some embodiments, a library of candidate protease substrate peptides is screened for their ability to be cleaved by a specific protease and the specific candidate substrate is identified by the sequence tag of the template. In some embodiments, a specific candidate protease substrate peptide may be included in a template including a specific sequence tag, thus allowing for identification of a candidate substrate that is a target of a specific protease by the sequence tag sequence of an RD-PCR product resulting from an RD-PCR procedure employing the protease. In some embodiments, a plurality of identified protease substrate sequences for a specific protease are combined into a list, or a substrate profile, of the protease. The identification of a protease substrate sequence, or the generation of a protease substrate profile, may be useful in predicting protease target structures and/or identifying a molecular in vivo target of a protease. In some embodiments, where a plurality of candidate substrates is screened, a plurality of RD-PCR products with different sequence tag sequences is amplified, and a plurality of protease substrate sequences are identified, the sequences may be compared and aligned, and/or a consensus sequence may be generated from the sequences identified. Methods to generate consensus sequences from a list of sequences are well known to those of skill in the art. Quantitative information from the RD-PCR procedure, for example, the relative amounts of a template with a specific sequence tag among all amplified templates, may be used to determine enzyme preference for a specific substrate. Methods of reflecting enzyme preference in consensus sequence calculations are well known to those in the art.
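One simple way to reflect enzyme preference in a consensus sequence is to weight each identified substrate by its relative abundance among the amplified templates and take the most heavily weighted residue at each position. The sketch below is one possible approach under stated assumptions, not a method prescribed herein: the substrates are assumed to be pre-aligned and of equal length, ties are broken alphabetically, and the function name `weighted_consensus` and the example peptides and counts are hypothetical.

```python
from collections import defaultdict


def weighted_consensus(substrate_counts):
    """Return a consensus sequence from aligned, equal-length substrates.

    substrate_counts -- dict mapping substrate sequence -> read count,
                        where the count serves as the weight.
    """
    lengths = {len(seq) for seq in substrate_counts}
    if len(lengths) != 1:
        raise ValueError("substrates must be non-empty and of equal length")
    (length,) = lengths

    consensus = []
    for position in range(length):
        weights = defaultdict(float)
        for seq, count in substrate_counts.items():
            weights[seq[position]] += count
        # sorted() makes tie-breaking deterministic (alphabetical order)
        consensus.append(max(sorted(weights), key=weights.get))
    return "".join(consensus)


# Hypothetical substrate sequences and read counts, for illustration only.
example_profile = {"AAPF": 50, "AAPL": 30, "GAPF": 20}
example_consensus = weighted_consensus(example_profile)
```

The per-position weights computed inside the loop could equally be reported as a frequency table, which corresponds to the weighted binding information of a protease substrate profile.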

In some embodiments, RD-PCR is used to identify a substrate of a functional nucleic acid (e.g., a ribozyme or a DNAzyme). In some embodiments, a template is provided including a candidate substrate for a functional nucleic acid. In some embodiments, hydrolysis of a covalent bond within a candidate substrate by a functional nucleic acid results in the formation of a reactive moiety, for example, an amine group, a carboxyl group, a phosphate group, or a hydroxyl group, able to form a covalent bond with a target reactive moiety. In some embodiments, a plurality of templates including a candidate substrate are provided and contacted with a functional nucleic acid under conditions suitable for the functional nucleic acid to bind its substrate and catalyze the desired reaction (e.g., cleavage or formation of a covalent bond). Suitable conditions for in vitro nucleic acid enzyme binding and activity are well known to those of skill in the art for various functional nucleic acids, such as ribozymes and DNAzymes. For example, suitable conditions for various ribozymes and DNAzymes are described in Sioud, Ribozymes and siRNA protocols, Humana Press, 2004, and Brakmann et al., Evolutionary methods in Biotechnology, Wiley-VCH, 2004, both of which are incorporated herein by reference. In some embodiments, a library of candidate substrates, for example, a variety of candidate nucleic acid or amino acid sequences, is screened for an actual substrate of a specific functional nucleic acid. An actual substrate can be identified by the sequence tag of the template it was coupled to after amplification of the template in a polymerase chain reaction. If a plurality of actual substrates is identified for a given functional nucleic acid, enzyme profiling and determination of consensus substrate sequence, if applicable, may be performed as outlined above.

For example, in some embodiments, a nucleic acid template is provided that includes a first primer hybridization site, a sequence tag, a candidate substrate nucleic acid sequence, and a PCR primer hybridization site. In some embodiments, the template is contacted with a primer that includes a reactive moiety that can form a covalent bond to a nucleic acid. In some embodiments, the template is designed in a manner that the end at which the candidate substrate is situated is modified in a manner precluding the formation of a covalent bond between the reactive moiety of the first primer and the template. In some embodiments, the template is contacted with an endonuclease that catalyzes the cleavage (e.g., by hydrolysis) of an internucleotide bond in its substrate sequence. In some embodiments, cleavage of an internucleotide bond within a substrate nucleic acid sequence leaves an unmodified nucleic acid end (e.g., a free hydroxyl group) to which the reactive moiety of the first primer can form a covalent bond. In some embodiments, a part of the template that included actual substrate nucleic acid sequences is then amplified in the polymerase chain reaction and the sequence of the substrate nucleic acid sequence is identified by the sequence tag of the template.

In some embodiments, RD-PCR is used to identify a functional nucleic acid that can catalyze a specific reaction on a given substrate. In some embodiments, a template is provided including a candidate functional nucleic acid, a first primer hybridization site, and, optionally, a sequence tag, and a PCR primer hybridization site. In some embodiments, the functional nucleic acid is a cis-acting nucleic acid and the template also includes a specific substrate. In such embodiments, the template is contacted with a first primer coupled to a reactive moiety able to form a covalent bond to the template only if the candidate functional nucleic acid has catalyzed a specific reaction on the substrate. In other embodiments, a specific substrate is provided coupled to the first primer. In such embodiments, the template comprises a reactive moiety able to form a covalent bond to the first primer only after the candidate functional nucleic acid has catalyzed a reaction on the substrate. In some embodiments, a part of the template including the functional nucleic acid is amplified in the polymerase chain reaction. In some embodiments, the functional nucleic acid sequence is determined by the sequence tag of the template. In some embodiments, the sequence of the functional nucleic acid is amplified as part of the template in the polymerase chain reaction and can be determined directly by sequencing.

In some embodiments, a library of templates is provided including a candidate functional nucleic acid, for example, a ribozyme or a DNAzyme, a first primer hybridization site, a reactive moiety able to form a covalent bond with a nucleic acid, but not with any other nucleic acid template provided in the library, and, optionally, a sequence tag. In some embodiments, the template is contacted with a first primer including a specific substrate, for example, a specific nucleic acid sequence, wherein the first primer is designed in a manner that it cannot form a covalent bond with the provided template unless the substrate is modified. For example, in some embodiments, the 5′-end of the first primer includes a protecting group precluding ligation of the first primer to a template molecule. Only after modification of the substrate, for example, by cleavage of an internucleotide bond by a functional nucleic acid, can a covalent bond be formed. In some embodiments, the template is contacted with the first primer under conditions suitable for a functional nucleic acid to bind its substrate and catalyze a chemical reaction, for example, cleavage of an internucleotide bond, and for ligation of the first primer to the nucleic acid template. In some embodiments, a part of the template is amplified in a subsequent polymerase chain reaction, and the sequence of a functional nucleic acid is determined by the sequence tag associated with it, or by directly sequencing the functional nucleic acid portion of the template.

Suitable conditions for an enzyme to bind and react with a substrate molecule depend, of course, on the nature of the enzyme and the substrate and the reaction being catalyzed. Suitable conditions for many enzymes and enzyme types are known to those of skill in the art. For example, suitable conditions for various proteases are described in Antalis et al., Proteases in Cancer: Methods and Protocols (Methods in Molecular Biology), Humana Press, 1st edition, 2009. Similarly, suitable conditions for enzymes the activity of which has been examined in a published in vitro assay can be extrapolated from the respective publication. Conditions for enzymes similar to those for which in vitro assay conditions are known can be extrapolated from assay parameters of closely related enzymes. In general, mimicking aspects of the physiological conditions under which an enzyme functions in vivo will provide suitable conditions for an in vitro assay and, thus, for RD-PCR.

In some embodiments, a PCR is performed with a template and a primer that interact not via covalent, but via non-covalent interaction, as shown in FIG. 16. Such embodiments are referred to as “interaction-dependent PCR,” or ID-PCR. Non-covalent interactions suitable for ID-PCR reactions include, for example, interactions of chemical or biological ligands (e.g., interactions between a protein and a small chemical compound, between two chemical compounds, between an enzyme and its substrate, between a protein and its ligand, between an antibody and its epitope, etc.). Non-limiting examples of interactions suitable for ID-PCR include hydrogen bonds, electrostatic interactions, magnetic interactions, π-stacking interactions, dipole-dipole interactions, hydrophobic interactions, van der Waals interactions, or combinations thereof. In some embodiments, the non-covalent interaction is a direct interaction between a ligand coupled to a nucleic acid template and a target molecule coupled to a first primer. In some embodiments, the non-covalent interaction is an indirect interaction between a ligand coupled to a nucleic acid and a target molecule coupled to a first primer, for example, via a third molecule that both (ligand and target molecule) interact with. In some such embodiments, ligand and target molecule are of identical structure and bind to a multivalent binding molecule. Typically, non-covalent interactions suitable for ID-PCR are characterized by a K_(D)<10⁻⁶. In some ID-PCR embodiments, such non-covalent interactions are characterized by a K_(D)<10⁻⁷. In some ID-PCR embodiments, such non-covalent interactions are characterized by a K_(D)<10⁻⁸. In some ID-PCR embodiments, such non-covalent interactions are characterized by a K_(D)<10⁻⁹. In some ID-PCR embodiments, such non-covalent interactions are characterized by a K_(D)<10⁻¹⁰. In some ID-PCR embodiments, such non-covalent interactions are characterized by a K_(D)<10⁻¹¹. In some ID-PCR embodiments, such non-covalent interactions are characterized by a K_(D)<10⁻¹². In some ID-PCR embodiments including non-covalent interaction formation, the interaction between both binding partners (one coupled to the template, the other to the first primer) forms in the absence of any enzymatic activity. In some embodiments, a library of templates coupled to a candidate ligand, for example, a small compound candidate ligand, is screened using ID-PCR for an actual ligand of a given protein coupled to the first primer.

Suitable conditions for non-covalent interactions, primer hybridization and extension will depend, of course, on the nature of the non-covalent interaction to be screened for. Typical conditions for the screening of various types of non-covalent interactions are well known in the relevant arts and have been documented, for example, in numerous publications regarding screens of compound or protein libraries using conventional methodology, such as Cabilly, Combinatorial Peptide Library Protocols, Humana Press, 1998, and Janzen, High Throughput Screening: Methods and Protocols, Humana Press, 2002, both of which are incorporated herein by reference.

In some embodiments, ID-PCR is used to screen two libraries of binding partners against each other. For example, in some ID-PCR embodiments, a library of templates is provided including candidate polypeptides. In some embodiments, the specific candidate polypeptide a template is coupled to can be identified by the template's sequence tag. In some embodiments, the candidate polypeptides are rationally selected, for example, only polypeptides representing proteins, or fragments of proteins, from a biological pathway relevant for a specific disease may be included in a specific library. In some embodiments, only polypeptides that are either proteins, or fragments of proteins, that activate or inhibit a specific biological pathway are included in the template library. In other embodiments, the template library is not rationally selected, but may, for example, be a library of randomized compounds, peptides, or nucleic acids. In some embodiments, the library of templates is contacted with a library of primers coupled to candidate ligands, for example, peptides or small chemical compounds. In some embodiments, the specific ligand a primer molecule is coupled to can be identified by a sequence tag included in the primer. In some embodiments, the library of candidate polypeptides is contacted with the library of candidate ligands under conditions suitable for binding of a polypeptide to a ligand. The ID-PCR reaction amplifies a sequence including a sequence tag from the template (identifying the polypeptide) and a sequence tag from the primer (identifying the ligand). In some embodiments, ID-PCR is used to identify polypeptide-ligand pairs in simultaneous screens of multiple libraries. Thus, ID-PCR is useful in identifying binding partners (targets) of leads in drug development.
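The two-tag readout described above can be illustrated with a minimal Python sketch. It is not part of the disclosed methods: the tag sequences, the names TEMPLATE_TAGS, PRIMER_TAGS, and decode_pair, and the assumption that both tags are read in the same orientation from a single sequenced strand are all hypothetical and chosen only for illustration.

```python
# Hypothetical sequence tags; real tags would be defined when the libraries are built.
TEMPLATE_TAGS = {"ACGTAC": "polypeptide A", "TGCATG": "polypeptide B"}
PRIMER_TAGS = {"GGATCC": "ligand 1", "CCTAGG": "ligand 2"}


def decode_pair(amplicon):
    """Return (polypeptide, ligand) if exactly one tag of each kind is found, else None."""
    polypeptides = [name for tag, name in TEMPLATE_TAGS.items() if tag in amplicon]
    ligands = [name for tag, name in PRIMER_TAGS.items() if tag in amplicon]
    if len(polypeptides) == 1 and len(ligands) == 1:
        return polypeptides[0], ligands[0]
    return None  # missing or ambiguous tags: no pair is called


# An illustrative read carrying one template tag and one primer tag.
read = "TTTTACGTACAAAACCTAGGTTTT"
print(decode_pair(read))
```

Requiring exactly one tag of each kind is a conservative design choice for the sketch; reads with missing or multiple tags are simply left uncalled.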

In some embodiments, RD-PCR or ID-PCR is used as an environmental sensor. In some embodiments of RD-PCR, a template is provided including a reactive moiety that, under suitable conditions, only forms a covalent bond to a reactive moiety coupled to a first primer in the presence or absence of an environmental parameter or analyte, for example, in the presence of a specific metal ion, in the presence of an oxidizing environment, in the presence of an enzymatic activity, or in the presence of an environmental toxin or pathogen. In some embodiments of ID-PCR, a template is provided including a ligand that, under suitable conditions, only interacts with a target molecule coupled to a first primer in the presence or absence of an environmental parameter or analyte as described herein. In some ID-PCR embodiments, the template-coupled ligand does not directly interact with the primer-coupled target molecule, but interacts with the target molecule indirectly, for example, via a third molecule that interacts with both the ligand and the target molecule. In some such embodiments, the template-coupled ligand and the primer-coupled target molecule share the same binding domain structure and interact via a multivalent binding molecule. In some embodiments, the polymerase chain reaction depends on the formation or the cleavage of a covalent bond catalyzed by an enzyme only in the presence or the absence of an enzymatic co-factor. Co-factors of enzymes are well known in the art and include, for example, inorganic cofactors and organic cofactors. Inorganic cofactors include, for example, ions of Li, Na, K, Mg, Ca, Sr, Ba, Cr, Mn, Fe, Co, Ni, Cu, Zn, and Cd. Organic cofactors are generally small organic molecules (typically with a mass of less than 1 kDa), and include, for example, vitamins and vitamin derivatives (e.g., thiamine, niacin, pyridoxine, lipoic acid, cobalamin, biotin, pantothenic acid, folic acid, menaquinone, ascorbic acid, riboflavin, and their derivatives), and other organic cofactors (e.g., ATP, coenzyme B, M, and Q, glutathione, tetrahydrobiopterin, and methanofuran). In some embodiments, a biosensor RD-PCR or ID-PCR reaction is designed in a manner that the formation of the covalent bond or the interaction between first primer and template depends on a reaction catalyzed only in the presence of a specific cofactor. In some embodiments, a sample from the environment to be tested is obtained and added to the RD-PCR or ID-PCR reaction either in its original form or in a processed form, depending on the nature of the environmental parameter to be investigated. In some embodiments, if the RD-PCR or ID-PCR reaction yields an amplicon at a predefined PCR cycle number, then it is determined that the environmental sample contained the environmental parameter or analyte at the time the sample was taken. In some embodiments, if the RD-PCR or ID-PCR reaction does not yield an amplicon at a predefined cycle number, then it is determined that the environment to be tested did not contain the environmental parameter or analyte or contained it only below threshold concentration at the time the sample was taken.

In some embodiments, a plurality of RD-PCR or ID-PCR template-primer pairs are provided in a multiplex reaction, wherein the reaction or interaction of different template-primer pairs depends on the presence or the absence of different analytes, thus allowing one to analyze multiple analytes or environmental parameters in parallel. For example, two template-primer pairs may be provided, wherein formation of a covalent bond between template and primer of the first pair depends on the presence of a first analyte and formation of a covalent bond between template and primer of the second pair depends on the presence of a second analyte. In some embodiments, a sequence tag is included in the template identifying the analyte involved in the reaction or the interaction between the template and the primer. In some embodiments, the presence or absence of a specific analyte is determined by identifying the sequence tag of an amplicon from a multiplex RD-PCR or ID-PCR reaction. In some embodiments, different template-primer pairs of a multiplex RD-PCR or ID-PCR reaction are designed to yield amplicons of different length, allowing a determination of the presence or absence of a specific analyte or environmental condition by the length of the amplicon yielded in the polymerase chain reaction. For example, a first template-primer pair may be designed to yield an amplicon of about 250 base pairs in the presence of a first analyte and a second template-primer pair may be designed to yield an amplicon of about 500 base pairs in the presence of a second analyte. In this exemplary embodiment, an amplicon of about 250 base pairs is indicative of the presence of the first analyte and an amplicon of about 500 base pairs is indicative of the presence of the second analyte. Methods for determining the length of a PCR amplicon are well known in the art and include, for example, gel electrophoresis.
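The length-based readout of the exemplary two-analyte multiplex reaction can be sketched as follows. This Python fragment is illustrative only; the names EXPECTED_LENGTHS_BP and call_analytes and the 25 bp tolerance are assumptions introduced here and are not specified in this disclosure.

```python
# Expected amplicon lengths from the exemplary embodiment (about 250 bp and about 500 bp).
EXPECTED_LENGTHS_BP = {"first analyte": 250, "second analyte": 500}


def call_analytes(observed_lengths_bp, tolerance_bp=25):
    """Return the set of analytes whose expected amplicon length matches an observed band.

    tolerance_bp is a hypothetical window for "about N base pairs"; in practice it
    would depend on the resolution of the sizing method (e.g., gel electrophoresis).
    """
    detected = set()
    for length in observed_lengths_bp:
        for analyte, expected in EXPECTED_LENGTHS_BP.items():
            if abs(length - expected) <= tolerance_bp:
                detected.add(analyte)
    return detected


print(call_analytes([248]))        # only the shorter band observed
print(call_analytes([255, 495]))   # both bands observed
```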

In some embodiments, a kit containing RD-PCR or ID-PCR reagents is provided. In some embodiments, a kit is provided containing a nucleic acid molecule, for example, a nucleic acid template or a primer. In some embodiments, a kit is provided containing a primer coupled with a target reactive moiety. In some embodiments, a kit is provided containing a nucleic acid template coupled with a candidate reactive moiety. In some embodiments, a kit is provided containing a library of candidate reactive moieties for RD-PCR or a library of candidate ligands for ID-PCR, for example, a plurality of nucleic acid templates coupled with a candidate reactive moiety or a candidate ligand. In some embodiments, a kit is provided containing a reagent for coupling a nucleic acid, for example, a nucleic acid template or primer, with a reactive moiety or a binding molecule, for example, a candidate or target reactive moiety or a candidate ligand or target molecule. In some embodiments, a catalyst or an enzyme is provided. In some embodiments, reagents to generate suitable conditions for a chemical reaction or interaction, or for a specific catalyst or enzyme, are provided. In some embodiments, a kit is provided containing PCR reagents, for example, a PCR primer, a PCR buffer, a PCR enzyme, for example, a thermophilic DNA polymerase, and/or a salt or co-factor necessary for the function of the PCR enzyme, for example, Mg²⁺. In some embodiments, instructions containing information or protocols for the use of the kit to perform RD-PCR or ID-PCR are provided.

EXAMPLES

Example 1

Reactivity-Dependent PCR: Direct, Solution-Phase In Vitro Selection for Bond Formation

RD-PCR is based on the well-established observation that the melting temperatures (T_(m)) of double-stranded nucleic acids are substantially higher when hybridization occurs intramolecularly as opposed to intermolecularly.(10) For example, the DNA hairpin 1 with an 8 bp stem is predicted(11) to exhibit a T_(m) of 48° C., while the intermolecular hybridization of two DNA strands of the same sequence (2 and 3) is predicted to be far less favorable, with a T_(m) of only 11° C. (FIG. 2). The significant difference in intramolecular versus intermolecular duplex stability could enable a new type of in vitro selection, wherein bond formation or bond cleavage is transduced into the formation of a self-priming DNA hairpin. This hairpin enables the selective PCR amplification of those DNA sequences that encode the reactive species (FIG. 2b).(12)

The ability of intramolecular self-priming to result in preferential DNA amplification was assessed first. A series of oligonucleotide pairs was synthesized (4a+5), each predicted to hybridize intermolecularly at their 3′ ends to form a short duplex region of 8 or 10 bp (FIG. 3).

The 5′-end of each DNA oligonucleotide in the pair contained a sequence identical to either primer 6 or primer 7. PCR amplification cannot occur until after DNA hybridization and 3′ extension take place to generate a single-stranded DNA molecule containing both primer 6 and a sequence complementary to primer 7. This 3′-extended species can then hybridize with primer 7 and initiate PCR amplification.

An analogous series of DNA oligonucleotides capable of hybridizing intramolecularly to form hairpin structures with 10-, 8-, or 6-bp stems was prepared (8a-8c). As with the intermolecularly hybridizing oligonucleotides, PCR amplification must be initiated by primer extension of the 3′-end. Quantitative, real-time PCR(13) (qPCR) was used to compare the ability of these oligonucleotides to undergo PCR amplification.

Consistent with our initial hypothesis, under identical PCR conditions and with equal starting concentrations of DNA, the intramolecularly hybridizing templates were amplified much more efficiently than their intermolecular counterparts (8a vs 4a+5a, 8b vs 4a+5b). The intramolecularly hybridizing templates reached a threshold level of amplified product 13 to 15 PCR cycles (C_(T)) earlier than the intermolecular templates, corresponding to a >2¹³-fold (>8000-fold) difference in effective initial template abundance. These qPCR results were corroborated by PAGE analysis; after 25 cycles of PCR, amplified product was only detected in reactions containing hairpin DNA. Collectively, these findings demonstrate that intramolecularly hybridizing templates can be amplified to abundant levels under conditions that fail to appreciably amplify the corresponding intermolecularly hybridizing templates. Subsequent experiments in this work were carried out with an 8-base stem, which was found to optimally balance robust intramolecular priming and poor intermolecular priming (see below).
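The conversion from a threshold-cycle difference to a fold difference in effective starting template can be written out explicitly. The short Python sketch below is illustrative only; the function name fold_difference is hypothetical, and the default of perfect doubling per cycle (efficiency = 1.0) is an idealizing assumption rather than a measured value.

```python
def fold_difference(delta_ct, efficiency=1.0):
    """Fold difference in effective starting template implied by a C_T shift.

    efficiency = 1.0 assumes the product exactly doubles each cycle (an
    idealization); lower values model less efficient amplification.
    """
    return (1.0 + efficiency) ** delta_ct


# C_T differences of 13 and 15 cycles, as reported above.
for delta_ct in (13, 15):
    print(delta_ct, fold_difference(delta_ct))
```

Under the doubling assumption, a shift of 13 cycles corresponds to 2¹³ = 8192, which is the basis of the ">8000-fold" figure.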

To use RD-PCR in a general selection for bond formation, the covalently linked functional groups in the hairpin loop must not interfere with the required hybridization and 3′-extension events. To test the compatibility of non-natural linkers with the preferential amplification of self-priming templates, the qPCR experiments in FIG. 3 were repeated with a series of non-natural linker structures, including ether, disulfide, and amide hairpin linkers. In all cases tested, DNA templates containing non-natural linkers (9a-9c) were far more efficiently amplified than the analogous intermolecularly hybridizing templates (4+10) (FIG. 4).

Since RD-PCR will ultimately be applied to mixtures of both active (resulting in DNA hairpins) and inactive (resulting in separate linear oligonucleotides) library members, the ability of hairpin templates to undergo preferential amplification in the presence of large excesses of corresponding linear molecules was tested next (FIG. 5). For these library-format experiments, the hairpin (11) and linear (12) template sequences vary only by a single base, such that DNA amplified from the hairpin contains a cleavage site for the restriction enzyme HindIII, while DNA arising from 12 does not.

The quantity of 11 spiked into an equimolar mixture of 12 and 13 was varied to determine the selectivity of RD-PCR in a library format. As little as 1 attomol of 11 (600 000 molecules) could be selectively amplified in the presence of a 10 000-fold excess of 12 and 13. The use of larger quantities of hairpin (corresponding to a 10- to 1000-fold excess of 12 and 13) overwhelmingly provided the desired product. These results demonstrate the ability of hairpin templates to be preferentially amplified even in the presence of large excesses of linear templates and indicate that RD-PCR can be applied to library-format selections.(14)
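The molecule count quoted above follows from Avogadro's number. The Python sketch below is illustrative only (the helper name molecules_from_moles is hypothetical); it converts 1 attomol of hairpin to molecules and computes the number of linear molecules at a 10 000-fold excess.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_from_moles(moles):
    """Convert an amount in moles to a number of molecules."""
    return moles * AVOGADRO


hairpin_molecules = molecules_from_moles(1e-18)   # 1 attomol of hairpin 11
linear_molecules = hairpin_molecules * 10_000     # 10 000-fold excess of 12 and 13

print(f"hairpin: {hairpin_molecules:.3g} molecules")
print(f"linear:  {linear_molecules:.3g} molecules")
```

The result, roughly 6.0 × 10⁵ hairpin molecules against roughly 6.0 × 10⁹ linear molecules, matches the rounded figure of 600 000 molecules given in the text.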

Following these studies, RD-PCR was applied to two model in vitro selections. First, RD-PCR was validated as a bond-formation selection for DNA-encoded reaction discovery. Previously, DNA-encoded reaction discovery required the capture, washing, and elution of active library members on avidin-linked beads (FIG. 1b).(5) In an RD-PCR version of reaction discovery, pairs of functional groups are attached to encoding DNA strands (FIG. 6). A disulfide linker temporarily joins each substrate pair (9b, 9d). Exposure to a set of reaction conditions and subsequent cleavage of the disulfide bond provide one of two possible outcomes. If a new covalent bond has formed between the functional groups, the hairpin-forming nucleotides remain tethered together through the reaction product, leaving a self-priming DNA hairpin (14). If no bond has formed, then only intermolecular hybridization is possible (15+16), resulting in inefficient PCR amplification. In contrast with previous reaction discovery selections, the RD-PCR version requires no solid-phase steps and minimal manipulation.

To test the ability of RD-PCR to support reaction discovery, a disulfide-linked substrate (9d) with pendant alkene and aryl iodide groups was synthesized, which should undergo a Pd-mediated Heck-type reaction (FIG. 6).(15) An unreactive control substrate (9b) containing an azide and an aldehyde was similarly generated. Each substrate was treated with 1 mM Na₂PdCl₄ (which is reduced to Pd(0) in situ) in aqueous pH 7.5 buffer for 30 min at 65° C., followed by DTT to cleave the disulfide bond. The resulting material was subjected to PCR. The DNA attached to the alkene-aryl iodide pair (9d) amplified efficiently (FIG. 6). Omission of Na₂PdCl₄ resulted in much less efficient amplification. Likewise, the unreactive substrate pair 9b did not undergo PCR amplification after identical treatment. Omission of DTT, however, enabled the disulfide-linked starting substrate to amplify efficiently. Collectively, these results indicate that RD-PCR can selectively and efficiently amplify DNA templates that have undergone bond formation and that amplification is dependent on the intramolecularity of the resulting template-primer species.(16) These experiments were corroborated using an azide/alkyne substrate that undergoes a Cu(I)-catalyzed cycloaddition reaction (see below).

In principle, RD-PCR can also enable efficient selection for bond cleavage, which has yet to be studied in a DNA-encoded context. To explore this possibility, the ability of DNA-linked peptides to undergo cleavage mediated by a protease was evaluated (FIG. 7). Protease-mediated cleavage of a DNA-peptide conjugate would expose a primary amine group, which would then undergo DNA-templated amide bond formation to generate a hairpin template for efficient PCR.(17) In contrast, the absence of proteolysis should result in no amide formation and thus inefficient PCR amplification.

A DNA-N-acetyl-pentapeptide conjugate (17), synthesized by solid phase cosynthesis, was exposed to subtilisin A. The peptide sequence (Ac-N-AFGPA) was designed to include cleavage sites for subtilisin A.(18) The enzyme-treated DNA was combined with a carboxylic acid-linked DNA primer (10c) under conditions (DMT-MM or sNHS+EDC) that support DNA-templated amide bond formation.(19)

Addition of the protease-digested and carboxylate-ligated DNA-peptide conjugate (18) to a PCR reaction resulted in efficient PCR amplification. In contrast, no PCR product was detected by PAGE when unfunctionalized DNA (4a) was used in place of the pentapeptide or when subtilisin A was omitted. Likewise, omission of the amide formation reagents also resulted in inefficient PCR amplification, consistent with the necessity of intramolecular primer hybridization for rapid amplification. These findings together demonstrate the ability of RD-PCR to rapidly detect DNA-linked peptide substrates of protease enzymes.

In conclusion, RD-PCR was developed and validated as a new, entirely solution-phase method for the selective amplification of DNA sequences encoding molecules that undergo bond formation or bond cleavage.(9) By obviating the need to perform solid-phase capture, washing, and elution steps, RD-PCR can greatly streamline the selection process for applications such as DNA-encoded reaction discovery and protease activity profiling. Compared with the performance characteristics of previous in vitro selection methods,(3c, 5a) the data above suggest that RD-PCR may also offer superior enrichment factors (signal:background ratios).

Experimentals

General Methods.

DNA oligonucleotides were synthesized using standard automated solid-phase phosphoramidite coupling methods on a PerSeptive Biosystems Expedite 8909 DNA synthesizer or purchased from Integrated DNA Technologies. All reagents and phosphoramidites for DNA synthesis were purchased from Glen Research. Oligonucleotides were purified by reverse-phase high-pressure liquid chromatography (HPLC) using a C18 stationary phase and an acetonitrile/100 mM triethyl ammonium acetate gradient or by Oligonucleotide Purification Cartridge (Applied Biosystems). Oligonucleotide concentrations were quantitated by UV spectroscopy on a Nanodrop ND1000 Spectrophotometer. Non-commercial, modified oligonucleotides were characterized by LCMS on a Waters Acquity UPLC equipped with a Waters Acquity UPLC BEH C18 column using an aqueous 6 mM tetraethyl ammonium bicarbonate/MeOH mobile phase. Electrospray mass spectrometry was carried out on a Waters Q-TOF Premier instrument. All DNA sequences are written in the 5′ to 3′ orientation.

Gels stained with ethidium bromide were visualized on an Alpha Innotech AlphaImager HP. Fluorescence images were acquired on a GE Typhoon Trio variable mode imager. Solid phase peptide synthesis was carried out on an Applied Biosystems 433A peptide synthesizer using standard Fmoc chemistry. Water was purified with a Milli-Q purification system. All chemical reagents were purchased from Sigma-Aldrich, unless otherwise noted. DMT-MM was synthesized according to the method of Kunishima, M.; Kawachi, C.; Iwasaki, F.; Terao, K.; Tani, S. Tetrahedron Lett. 1999, 40, 5327. Subtilisin A was purchased from Sigma-Aldrich. HindIII and PvuII-HF were purchased from New England Biolabs.

General Method for PCR.

All PCR reactions were carried out with AmpliTaq Gold DNA Polymerase (0.1 μL/20 μL reaction volume, Applied Biosystems) in the provided buffer. PCR reactions included Mg²⁺ (3 mM), dNTPs (200 μM each, BioRad), and primers (500 nM each). Templates were amplified from a standard initial concentration of 625 pM, unless otherwise noted. The thermal cycling sequence was as follows: 95° C. for 10 minutes, then iterated cycles of 95° C. for 30 seconds, 58° C. for 30 seconds, and 72° C. for 30 seconds. In preparative PCR reactions, upon completion of the iterated cycles, a final incubation at 72° C. for 2 minutes was performed. For qPCR, conditions were identical to those above, except that SYBR Green I nucleic acid gel stain (0.5× final concentration from a 10,000× stock solution, Invitrogen) was added to the reaction mixture. Quantitative PCR experiments were performed in triplicate on a BioRad CFX96 Real-Time PCR Detection System.
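For orientation, the standard conditions can be translated into a template copy number and a programmed hold time. The Python sketch below is illustrative only: the function names are hypothetical, a 20 μL reaction volume is assumed from the stated polymerase ratio, and ramp times between temperature steps are ignored.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def template_copies(conc_molar, volume_liters):
    """Number of template molecules in a reaction of the given volume."""
    return conc_molar * volume_liters * AVOGADRO


def hold_time_seconds(cycles, final_extension=False):
    """Total programmed hold time for the cycling sequence described above.

    Ramp times between steps are not included (a simplifying assumption).
    """
    initial_denaturation = 10 * 60   # 95 C for 10 minutes
    per_cycle = 30 + 30 + 30         # 95 C, 58 C, and 72 C for 30 seconds each
    final = 2 * 60 if final_extension else 0  # preparative reactions only
    return initial_denaturation + cycles * per_cycle + final


# 625 pM template in an assumed 20 uL reaction.
print(f"{template_copies(625e-12, 20e-6):.3g} template copies")
# 25 cycles, as used for the PAGE analysis.
print(hold_time_seconds(25) / 60, "minutes of programmed holds")
```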

Oligonucleotide Modeling Program (OMP) Calculation (FIG. 2a):

Sequence for Hairpin Architecture (Complementary Region in Bold)

1: (SEQ ID NO: 2) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG TCT GAC TAC AGA GTG GGA TGC ATA GAA C

Sequences for Intermolecular Architecture

2: (SEQ ID NO: 50) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG

3: (SEQ ID NO: 3) TCT GAC TAC AGA GTG GGA TGC ATA GAA C

OMP calculation was performed using the following parameters: assay temperature: 37° C.; Mg²⁺: 2 mM; monovalent cations: 0.1 M; DNA concentration: 10 nM.

Hairpin vs. Intermolecular Architecture Comparison (FIG. 3)

Primer Sequences

6: (SEQ ID NO: 4) GCT GAC TAC AGA GTG GGA TG

7: (SEQ ID NO: 5) GCA GTA CCA ACC CTG TAC AC

Sequences for Intermolecular Architecture

4a: (SEQ ID NO: 6) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG

5a: (SEQ ID NO: 7) GCT GAC TAC AGA GTG GGA TGC ATA GAA CTT (10 bp duplex)

5b: (SEQ ID NO: 8) GCT GAC TAC AGA GTG GGA TGC ATA GAA C (8 bp duplex)

Sequences for Intramolecular Architecture

8a: (SEQ ID NO: 9) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG GCT GAC TAC AGA GTG GGA TGC ATA GAA CTT (10 bp duplex)

8b: (SEQ ID NO: 10) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG GCT GAC TAC AGA GTG GGA TGC ATA GAA C (8 bp duplex)

8c: (SEQ ID NO: 11) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG GCT GAC TAC AGA GTG GGA TGC ATA GA (6 bp duplex)
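The stated duplex lengths can be checked directly against the listed sequences. The following Python sketch is illustrative only and is not the Oligonucleotide Modeling Program calculation; the function names are hypothetical, and the check considers only perfect Watson-Crick complementarity at the 3′ termini, ignoring thermodynamics.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove the triplet spacing used in the sequence listing."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq):
    return clean(seq).translate(COMPLEMENT)[::-1]


def hairpin_stem_length(seq):
    """Longest 3'-terminal segment whose reverse complement occurs upstream."""
    s = clean(seq)
    for k in range(len(s) // 2, 0, -1):
        if reverse_complement(s[-k:]) in s[:-k]:
            return k
    return 0


def duplex_length(strand_a, strand_b):
    """Longest k for which the 3'-terminal k-mers of two strands are reverse complements."""
    a, b = clean(strand_a), clean(strand_b)
    best = 0
    for k in range(1, min(len(a), len(b)) + 1):
        if reverse_complement(a[-k:]) == b[-k:]:
            best = k
    return best


SEQ_4A = "GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG"
SEQ_5A = "GCT GAC TAC AGA GTG GGA TGC ATA GAA CTT"
SEQ_5B = "GCT GAC TAC AGA GTG GGA TGC ATA GAA C"
SEQ_8A = SEQ_4A + " " + SEQ_5A   # hairpin 8a as listed: 4a followed by 5a
SEQ_8B = SEQ_4A + " " + SEQ_5B   # hairpin 8b as listed: 4a followed by 5b
SEQ_8C = SEQ_4A + " GCT GAC TAC AGA GTG GGA TGC ATA GA"

for name, seq in (("8a", SEQ_8A), ("8b", SEQ_8B), ("8c", SEQ_8C)):
    print(name, hairpin_stem_length(seq))
print("4a+5a", duplex_length(SEQ_4A, SEQ_5A))
print("4a+5b", duplex_length(SEQ_4A, SEQ_5B))
```

Building 8a and 8b by concatenation reflects the listing, in which each hairpin sequence is the corresponding intermolecular pair joined end to end.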

qPCR:

The appropriate hairpin 8 (625 pM) or a 1:1 mixture of 4a and 5 for the intermolecular cases (625 pM each) was subjected to qPCR under the standard conditions. PAGE: The appropriate hairpin 8 or a 1:1 mixture of 4a and 5 for the intermolecular cases was subjected to 25 cycles of PCR under the standard conditions. The reactions were analyzed by PAGE (10% TBE gel, 200 V, 20 minutes).

Non-Natural Linker Experiments: Oligo-Ethylene Glycol (FIG. 4)

Primer Sequences

S1: (SEQ ID NO: 12) GCA GTA CCA ACC CTG TAC AC

S2: (SEQ ID NO: 13) CCT GAC TAC AGA GTG GGA TG

Sequences for Intermolecular Architecture

4a: (SEQ ID NO: 14) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG

10a: (SEQ ID NO: 15) CCT GAC TAC AGA GTG GGA TGC ATA GAA C (8 bp duplex)

S3: (SEQ ID NO: 16) CCT GAC TAC AGA GTG GGA TGC ATA GAA TT (10 bp duplex)

Sequences for Hairpin Architecture (s9=Spacer Phosphoramidite 9, Glen Research)

S4: (SEQ ID NO: 17) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG s9 CCT GAC TAC AGA GTG GGA TGC ATA GAA CTT (10 bp hairpin)

9a: (SEQ ID NO: 18) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG s9 CCT GAC TAC AGA GTG GGA TGC ATA GAA C (8 bp hairpin)

S5: (SEQ ID NO: 19) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG s9 CCT GAC TAC AGA GTG GGA TGC ATA GA (6 bp hairpin)

Optimization of Stem Length

The appropriate hairpin was subjected to qPCR under the standard conditions. The substrates with 10- and 8-bp stems were amplified to threshold detection levels in 17-18 cycles, while the 6-bp stem was far less efficient in initiating PCR (FIG. 8).

qPCR Standard Curve

To verify that the hairpin structures were well-behaved PCR templates over a range of concentrations, a standard curve was generated by qPCR. Five different concentrations of hairpin 9a (625 pM, 31 pM, 1.6 pM, 80 aM, 4 aM) were subjected to qPCR under the standard conditions, except that a 64° C. annealing step was used instead of 58° C. The log of starting copy number was plotted vs. threshold cycle, and a linear function was fit to the data (FIG. 9).

Non-Natural Linker Experiments: Disulfide (FIG. 4)

Sequences

amine=Amino-Modifier Serinol Phosphoramidite; 3′thiol=3′-Thiol-Modifier C6 S-S CPG; 5′thiol=Thiol-Modifier C6 S-S

S6: (SEQ ID NO: 20) GAG CTC GTT GAT ATC CGC AGA CAT GAG CCC CAC TAC ACA CAC C (amine)(3′thiol)

S7: (SEQ ID NO: 21) (5′thiol)(amine) ACC TAA AGC TAG CAG CTG GCC GTG ATC AGC TTG GTG TGT G

Synthesis of 9b

The disulfide 9b was synthesized using the route shown in FIG. 10. S6 and S7 were first functionalized with small-molecule carboxylic acid derivatives, providing 4b and 10b. A mixed disulfide was then generated from 10b, which was reacted with the free thiol analog of 4b to yield 9b.

Acylation of S6 and S7

The appropriate carboxylic acid (0.1 mmol) and N-hydroxysuccinimide (0.1 mmol) were dissolved in 0.1 mL DMF in a 1.5 mL Eppendorf tube. N,N′-dicyclohexylcarbodiimide (DCC) (0.1 mmol) was added, and the resulting mixture was agitated at room temperature for 30 minutes. During this time, a white precipitate formed. The reaction was briefly centrifuged, and 0.05 mL of the supernatant was added to a solution of S6 or S7 (10 nmol) in 0.1 mL of 0.2 M phosphate buffer (pH 8) in a separate Eppendorf tube. The resulting solution was vigorously agitated for 6 hours, and then diluted (0.5 mL total volume) prior to purification by Nap5 column. The recovered DNA was further purified by HPLC, typically yielding 1-2 nmol of the desired product (10b or 4b). 4-Azidobutyric acid (Kanan, M. W.; Rozenman, M. M.; Sakurai, K.; Snyder, T. M.; Liu, D. R. Nature 2004, 431, 545) was coupled to S6 to give 4b. 4-Formylbenzoic acid was coupled to S7 to give 10b.

Ligation of 4b and 10b

A solution of 4b (500 pmol) and DL-dithiothreitol (DTT, 0.3 M) in 0.1 mL of HEPES buffer (0.1 M, pH 8.5) was agitated for 30 minutes in a 1.5 mL eppendorf tube. A second solution with 10b (instead of 4b) was treated equivalently. The deprotected, thiol-containing DNA was precipitated with ethanol from each reaction. The residue containing 10b was taken up in 0.1 mL of 9:1 50 mM TrisHCl (pH 7.5):ethanol containing 10 mM 2,2′-dipyridyl disulfide. The resulting suspension was agitated for 1 hour, and the DNA was then recovered by ethanol precipitation. The activated residue containing 10b and the residue containing 4b were separately dissolved in 10 μL of 50 mM TrisHCl (pH 7.5), and the concentration of DNA in each solution was determined. An equimolar quantity of the 4b solution was added to the solution of 10b (the total volume did not exceed 20 μL), and the resulting solution was agitated at 4° C. for 12 hours. The DNA was precipitated with ethanol, and the desired product (9b) was isolated by gel purification (3% Ambion Agarose-HR gel). Typically, 100 pmol of disulfide 9b was obtained.

qPCR Analysis

Primer Sequences

S8: (SEQ ID NO: 22) GAG CTC GTT GAT ATC CGC AG

S9: (SEQ ID NO: 23) ACC TAA AGC TAG CAG CTG GC

The hairpin 9b (500 pM) or a 1:1 mixture of 4b and 10b (500 pM each) for the intermolecular case was subjected to qPCR under the standard conditions.

PAGE analysis: The appropriate hairpin 9b (500 pM) or a 1:1 mixture of 4b and 10b for the intermolecular case was subjected to 20 cycles of PCR under the standard conditions. The reactions were analyzed by PAGE (10% TBE gel, 200 V, 20 minutes) (FIG. 11).

Non-Natural Linker Experiments: Amide (FIG. 4)

Sequences

NH₂=3′ amino modifier C7 CPG, CO₂H=5′ carboxy modifier C10

4c: (SEQ ID NO: 24) GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG-NH₂

10c: (SEQ ID NO: 25) CO₂H-CTG AGC TCG TTG ATA TCC GCA GCA TAG AAC

Formation of 9c from 4c and 10c

To a 45 μL solution of 4c (2.5 pmol) and 10c (3.75 pmol) in 8:1 MOPS buffer (100 mM MOPS, 1 M NaCl, pH 7.5):CH₃CN was added 5 μL of a 0.14 mg/μL solution of DMT-MM in MOPS buffer. The resulting solution (50 μL total volume, 9:1 MOPS:CH₃CN) was briefly vortexed and then left at 4° C. for 14 hours. The reaction was allowed to warm to room temperature and diluted with 50 μL of 0.1 M aqueous NaCl. The DNA was precipitated with ethanol prior to subsequent analysis. Control experiments were performed analogously, but with omission of DMT-MM or with substitution of hydroxyl-terminated DNA (4a) for the amine-terminated DNA (4c).
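The final concentrations implied by the quantities above can be worked out with a short calculation. This is a sketch under stated assumptions: the molar mass of 276.72 g/mol assumes DMT-MM was used as its chloride salt, which the text does not specify, and the function names are introduced here for illustration.

```python
# Assumed molar mass of DMT-MM chloride (C10H17ClN4O3); not stated in the text.
DMTMM_MW_G_PER_MOL = 276.72

def dna_conc_nM(amount_pmol, volume_uL):
    """pmol per microliter equals micromolar; multiply by 1000 for nanomolar."""
    return amount_pmol / volume_uL * 1000.0

def reagent_conc_mM(stock_mg_per_uL, stock_volume_uL, total_volume_uL, mw_g_per_mol):
    """Final millimolar concentration of a reagent added from a mg/uL stock."""
    mass_mg = stock_mg_per_uL * stock_volume_uL
    grams_per_liter = mass_mg / total_volume_uL * 1000.0  # 1 mg/uL == 1000 g/L
    return grams_per_liter / mw_g_per_mol * 1000.0

print(f"4c:  {dna_conc_nM(2.5, 50):.0f} nM")
print(f"10c: {dna_conc_nM(3.75, 50):.0f} nM")
print(f"DMT-MM: {reagent_conc_mM(0.14, 5, 50, DMTMM_MW_G_PER_MOL):.1f} mM")
```

Under these assumptions the coupling reagent is present in roughly a million-fold molar excess over the DNA strands, so the strand concentrations, not the reagent, limit the reaction.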

qPCR Analysis

Primer sequences

S1: (SEQ ID NO: 26) GCA GTA CCA ACC CTG TAC AC

S10: (SEQ ID NO: 27) CTG AGC TCG TTG ATA TCC GCA G

DNA from the acylation reaction (9c) was subjected to qPCR under the standard conditions. DNA from control reactions was identically treated. A dramatic dependence of CT upon addition of DMT-MM was observed, consistent with acylation and subsequent intramolecular priming through the amide linker (FIG. 12). Upon replacement of amine-modified DNA 4c with unmodified, 3′-OH 4a, DMT-MM did not influence PCR efficiency (FIG. 12).

Optimization of DNA-Templated Amide Bond Formation.

Experiments with fluorescently-tagged, amine-modified DNA enabled direct determination of conversion from amine-DNA to carboxylate-ligated DNA.

Sequences

Cy3=Cy3 Phosphoramidite (Glen Research)

S11: (SEQ ID NO: 28) Cy3-GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATC-NH₂

S12: (SEQ ID NO: 29) CO₂H-CCT GAC TAC AGA GTG GGA TGC ATA GAA C

S13: (SEQ ID NO: 30) CO₂H-CCT GAC TAC AGA GTG GGA TGT TGA CCG T

DMT-MM-Mediated Coupling of S11 and S12

To a 90 μL solution of S11 (5.0 pmol) and S12 (7.5 pmol) in 8:1 MOPS buffer (100 mM MOPS, 1 M NaCl, pH 7.5):CH₃CN was added 10 μL of a 0.14 mg/μL solution of DMT-MM in MOPS buffer. The resulting solution (100 μL total volume, 9:1 MOPS:CH₃CN) was briefly vortexed and then left at the appropriate temperature (see conditions, FIG. 13) for 14 hours. The DNA was precipitated with ethanol prior to subsequent analysis.

sNHS/EDC-Mediated Coupling of S11 and S12

A 90 μL solution of S11 (5.0 pmol) and S12 (7.5 pmol) in 8:1 MES buffer (100 mM MES, 1 M NaCl, pH 6.0):CH₃CN was added to 0.3 mg of sNHS in a 1.5 mL eppendorf tube. 10 μL of a 0.04 mg/μL solution of EDC in MES buffer was then added. The resulting solution (100 μL total volume, 9:1 MES:CH₃CN) was briefly vortexed and then left at the appropriate temperature for 14 hours. The DNA was precipitated with ethanol prior to subsequent analysis.

PAGE Analysis of Acylation Reactions

Denaturing PAGE (15% TBE-urea gel, 300 V, 25 minutes) and subsequent fluorescence quantitation were used to monitor the acylation reactions. Regardless of the acylation reagent employed, amide bond formation was markedly more efficient at 4° C. compared with the room temperature reactions (FIG. 13, lanes 1 vs. 2, 5 vs. 6). This result was consistent with our expectation that the melting temperature of the 8 bp intermolecular duplex is ~10° C., and therefore lower temperature provides higher reactivity by stabilizing the duplex intermediate. Furthermore, an experiment with a mismatched stem sequence (S11+S13), such that no intermolecular hybridization is possible, resulted in no amide product, demonstrating that DNA hybridization is essential to the reaction under our experimental conditions (FIG. 13, lane 4).

Library-Format Experiments (FIG. 5)

Sequences:

Stem-forming nucleotides are in bold, point mutations are italicized, and the HindIII recognition site is underlined.

6: (SEQ ID NO: 31) GCT GAC TAC AGA GTG GGA TG

S10: (SEQ ID NO: 32) CTG AGC TCG TTG ATA TCC GCA G

Sequences for Intermolecular Architecture

12: (SEQ ID NO: 33) GCT GAC TAC AGA GTG GGA TGA ATC TTC ATC TCA AGT TCT ATG

13: (SEQ ID NO: 34) CTG AGC TCG TTG ATA TCC GCA GCA TAG AAC

Sequences for Intramolecular Architecture

11:  (SEQ ID NO: 35) GCT GAC TAC AGA GTG GGA TGA AGC TTC ATC TCA AGT TCT ATG-spacer9-CTG AGC TCG TTG ATA TCC GCA GCA TAG AAC 
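The length of the stem that strands 12 and 13 can form through their 3′ ends can be checked directly from the sequences listed above. The following is a minimal sketch; the helper names `revcomp` and `three_prime_overlap` are introduced here for illustration and are not part of the original procedures.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def three_prime_overlap(a, b):
    """Largest k for which the 3'-terminal k nt of a and b are fully
    complementary in antiparallel orientation (0 if there is none)."""
    best = 0
    for k in range(1, min(len(a), len(b)) + 1):
        if a[-k:] == revcomp(b[-k:]):
            best = k
    return best

# Sequences 12 and 13 as listed above (spaces removed).
SEQ_12 = "GCT GAC TAC AGA GTG GGA TGA ATC TTC ATC TCA AGT TCT ATG".replace(" ", "")
SEQ_13 = "CTG AGC TCG TTG ATA TCC GCA GCA TAG AAC".replace(" ", "")

k = three_prime_overlap(SEQ_12, SEQ_13)
print(k, SEQ_12[-k:], SEQ_13[-k:])
```

Each value of k corresponds to a different register of the two 3′ ends, so the function reports the longest perfectly paired register rather than a partial match.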

Restriction Digestion Experiments

PCR reactions (60 μL total volume) were performed with constant concentrations of 12 and 13 (625 pM) but varying concentrations of 11 (62.5 pM, 6.25 pM, 0.625 pM, 0.0625 pM). The appropriate cycle number for each reaction was determined by prior qPCR evaluation, such that each reaction proceeded while exponential amplification was occurring, but not beyond, in order to minimize dynamic compression. Following PCR, each reaction was split into two aliquots (40 and 20 μL, respectively). To the larger aliquot was added 1 μL of HindIII in glycerol (10,000 units/mL). The resulting solution was incubated at 37° C. for 1 h, and then heated to 65° C. for 20 minutes to deactivate the enzyme. The other aliquot was treated equivalently, but with omission of HindIII. The resulting samples were analyzed by PAGE (10% TBE, 175 V, 25 minutes).

Library-Format PvuII Experiments

A parallel set of experiments was carried out to corroborate the HindIII digestion results. All sequences and procedures were identical to the HindIII experiments, except for those noted below. A double point mutation was used to distinguish the intermolecular and intramolecular substrates to take into account the lower sequence fidelity of PvuII.

Sequences

Sequences for Intermolecular Architecture

S15: (SEQ ID NO: 36) GCT GAC TAC AGA GTG GGA TGC AAG TGC ATC TCA AGT TCT ATG 

Sequences for Intramolecular Architecture

S16:  (SEQ ID NO: 37) GCT GAC TAC AGA GTG GGA TGC AGC TGC ATC TCA AGT TCT ATG-spacer9-CTG AGC TCG TTG ATA TCC GCA GCA TAG AAC 

Restriction digestion of the PCR reactions was carried out with 1 μL of PvuII-HF (10,000 units/mL stock in glycerol). Heat inactivation was achieved by incubation at 80° C. for 20 minutes following the 1 h incubation at 37° C. Reactions were analyzed as described above (FIG. 14).
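The way the restriction sites distinguish the two architectures can be illustrated by locating each recognition site in the intramolecular strand and comparing the same window in the intermolecular strand. This is a sketch only: it examines the 5′ segments of the listed strands (before the spacer), uses the standard recognition sequences AAGCTT (HindIII) and CAGCTG (PvuII), and the helper names are introduced here for illustration.

```python
HINDIII_SITE = "AAGCTT"
PVUII_SITE = "CAGCTG"

def clean(seq):
    """Remove the triplet spacing used in the sequence listings."""
    return seq.replace(" ", "").upper()

def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def site_report(intra, inter, site):
    """Position of the site in the intramolecular strand, the matching window
    in the intermolecular strand, and the number of point mutations."""
    pos = intra.find(site)
    window = inter[pos:pos + len(site)]
    return pos, window, hamming(site, window)

# 5' segments (before spacer9) of 11 and S16; full sequences of 12 and S15.
SEQ_11 = clean("GCT GAC TAC AGA GTG GGA TGA AGC TTC ATC TCA AGT TCT ATG")
SEQ_12 = clean("GCT GAC TAC AGA GTG GGA TGA ATC TTC ATC TCA AGT TCT ATG")
SEQ_S16 = clean("GCT GAC TAC AGA GTG GGA TGC AGC TGC ATC TCA AGT TCT ATG")
SEQ_S15 = clean("GCT GAC TAC AGA GTG GGA TGC AAG TGC ATC TCA AGT TCT ATG")

print(site_report(SEQ_11, SEQ_12, HINDIII_SITE))
print(site_report(SEQ_S16, SEQ_S15, PVUII_SITE))
```

In both cases the site sits immediately after the 20-nt primer-binding region, so digestion of the amplicon reports which template gave rise to it.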

Reaction Discovery Model System (FIG. 6)

Sequences

Amine=Amino-Modifier Serinol Phosphoramidite; 3′thiol=3′-Thiol-Modifier C6 S-S CPG; 5′thiol=Thiol-Modifier C6 S-S

S17: (SEQ ID NO: 38) GAG CTC GTT GAT ATC CGC AGA GCG TTA TGG TCC GAC ACA CAC C (amine)(3′thiol)

S18: (SEQ ID NO: 39) (5′thiol)(amine)ACC TAA AGC TAG CAG CTG GCG AGG TTC CAG ATG GTG TGT G

S19: (SEQ ID NO: 40) (5′thiol)(amine)ACC TAA AGC TAG CAG CTG GCC GCA CAC TTT CTG GTG TGT G

The free amine of each substrate above was coupled to a small-molecule carboxylic acid derivative as described earlier. For the reactions with 6-heptenoic acid and 10-undecynoic acid, the active ester was formed in 0.15 mL DMF, and 0.1 mL of the resulting soluble fraction was added to the DNA-NH₂, giving a total volume of 0.2 mL. This modification led to increased yields, presumably due to greater solubility of the hydrophobic carboxylic acids in DMF.

6-heptenoic acid was coupled to S17 to give S20.

4-iodophenylacetic acid was coupled to S18 to give S21.

10-undecynoic acid was coupled to S19 to give S22.

Three substrates were made by formation of a disulfide bond:

4b was coupled to 10b to give 9b.

S20 was coupled to S21 to give 9d.

4b was coupled to S22 to give S23.

Reaction of Disulfide-Linked DNA with Pd

5 pmol of the substrate oligo (9b or 9d) was added to 0.1 mL of a 1 mM disodium tetrachloropalladate solution in MOPS buffer (50 mM MOPS, 0.5 M NaCl, pH 7.5). The resulting solution was heated at 65° C. for 30 minutes. The DNA was recovered by ethanol precipitation, and then taken up in 0.1 mL of a 0.3 M DTT solution in HEPES buffer (0.1 M HEPES, pH 8.5). The reaction was left for 1 hour at 65° C. to ensure full cleavage of the disulfide bond, and the DNA was then recovered by ethanol precipitation.

qPCR of Pd-Treated DNA

Primer Sequences

S8: (SEQ ID NO: 41) GAG CTC GTT GAT ATC CGC AG

S9: (SEQ ID NO: 42) ACC TAA AGC TAG CAG CTG GC

DNA recovered from the Pd reactions and various controls was subjected to qPCR under slightly modified conditions. The starting concentration of DNA was 50 pM.

PAGE Analysis

DNA recovered from the Pd reactions and various controls was subjected to 23 cycles of PCR under slightly modified conditions. The starting concentration of DNA was 50 pM. The reactions were analyzed by PAGE (10% TBE gel, 200 V, 20 minutes).

Detection of Cu(I)-Catalyzed Huisgen Cycloaddition**

1 pmol of the substrate oligo (9b or S23) was added to 0.1 mL of a 9:1 H₂O:CH₃CN solution containing 2 mM Cu(I)Cl. After 30 minutes at room temperature, the reactions were subjected to ethanol precipitation. The DNA pellet was taken up in 0.1 mL of 0.3 M DTT in HEPES buffer (0.1 M HEPES, pH 8.5) and heated for 1 hour at 65° C. to ensure full cleavage of the disulfide bond. The DNA was then recovered by ethanol precipitation. (** a) Himo, F.; Lovell, T.; Hilgraf, R.; Rostovtsev, V. V.; Noodleman, L.; Sharpless, K. B.; Fokin, V. V. J. Am. Chem. Soc. 2005, 127, 210. For validation of DNA-encoded reaction discovery with this reaction, see Kanan, M. W.; Rozenman, M. M.; Sakurai, K.; Snyder, T. M.; Liu, D. R. Nature 2004, 431, 545.)

qPCR of Cu-Treated DNA

Primer Sequences

S8: (SEQ ID NO: 43) GAG CTC GTT GAT ATC CGC AG

S9: (SEQ ID NO: 44) ACC TAA AGC TAG CAG CTG GC

DNA recovered from the Cu reactions and various controls was subjected to qPCR under slightly modified conditions. The starting concentration of DNA was 5 pM.

PAGE Analysis

DNA recovered from the Cu reactions and various controls was subjected to 24 cycles of PCR under slightly modified conditions. The starting concentration of DNA was 5 pM. The reactions were analyzed by PAGE (10% TBE gel, 200 V, 20 minutes).

Protease Activity Detection Model System (FIG. 7)

Primers

S1: (SEQ ID NO: 45) GCA GTA CCA ACC CTG TAC AC

S10: (SEQ ID NO: 46) CTG AGC TCG TTG ATA TCC GCA G

Sequences

3a=3′amino modifier C7 CPG

DNA-peptide 17: GCA GTA CCA ACC CTG TAC ACC ATC TCA AGT TCT ATG-3a-Ala-Pro-Gly-Phe-Ala-NHAc (nucleic acid sequence: SEQ ID NO: 47; amino acid sequence: SEQ ID NO: 48)

Carboxylate 10c: (SEQ ID NO: 49) CO₂H-CTG AGC TCG TTG ATA TCC GCA GCA TAG AAC

Synthesis of DNA-Peptide Conjugate 17

Compound 17 was synthesized by solid-phase co-synthesis. 0.2 μmol of 3′ amino modifier C7 CPG was subjected to solid-phase peptide synthesis to install the pentapeptide on the Fmoc-amine group. The peptide synthesis included iterated rounds of Fmoc deprotection (20% piperidine/NMP), coupling (Fmoc-amino acid/HATU/DIPEA/NMP), and capping (5% acetic anhydride and 6% 2,6-lutidine in NMP).

The CPG was then subjected to standard solid-phase DNA synthesis to install the oligonucleotide at the site of the DMT-protected hydroxyl group. The substrate was cleaved from the resin by standard methods (NH4OH/methylamine). The 5′-DMT-protected DNA was then purified by HPLC. Following lyophilization and deprotection of the DMT group by standard methods (3% TFA), the DNA was repurified by HPLC to yield 17 (26.6 nmol).

DNA Detection of Bond Cleavage by Subtilisin A

DNA-peptide 17 (2.6 pmol/μL) was treated with subtilisin A (65 ng/μL) in PBS buffer (subsequent experiments demonstrated that lower concentrations of subtilisin A, down to 650 pg/μL, were sufficient to effect bond cleavage under otherwise identical conditions). After incubation at 37° C. for 90 minutes, the DNA and enzyme were separated by phenol/chloroform extraction. The DNA was recovered by ethanol precipitation, and taken up in the appropriate buffer (MOPS or MES) for acylation with 10c under the conditions described earlier. Quantitative PCR and PCR/PAGE analysis were performed as described earlier.

TABLE 1. LCMS characterization of functionalized oligonucleotides

Compound            Observed ion's charge   Expected m/z   Observed m/z
9c (3′-NH2)         −6                      1854.504       1854.484
10c (5′-CO2H)       −5                      1899.930       1899.963
17 (DNA-peptide)    −7                      1658.732       1658.683
4b (3′SSR)          −6                      2290.563       2290.628
10b (5′SSR)         −6                      2174.367       2174.353
S20                 −6                      2307.401       2307.337
S21                 −5                      2641.629       2641.669
S22                 −6                      2166.382       2166.339
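The agreement between expected and observed values in Table 1 can be expressed as a mass error in parts per million, and each m/z can be converted to a neutral mass. The sketch below assumes the ions are multiply deprotonated species, [M − nH]ⁿ⁻, which the table does not state explicitly; the function names are introduced here for illustration.

```python
PROTON_MASS = 1.007276  # Da

def ppm_error(expected_mz, observed_mz):
    """Signed mass error of an observed m/z relative to the expected m/z."""
    return (observed_mz - expected_mz) / expected_mz * 1e6

def neutral_mass(mz, charge):
    """Neutral mass for an assumed [M - nH]^n- ion (charge given as a negative int)."""
    n = abs(charge)
    return n * mz + n * PROTON_MASS

# (compound, charge, expected m/z, observed m/z) copied from Table 1.
TABLE_1 = [
    ("9c", -6, 1854.504, 1854.484),
    ("10c", -5, 1899.930, 1899.963),
    ("17", -7, 1658.732, 1658.683),
    ("4b", -6, 2290.563, 2290.628),
    ("10b", -6, 2174.367, 2174.353),
    ("S20", -6, 2307.401, 2307.337),
    ("S21", -5, 2641.629, 2641.669),
    ("S22", -6, 2166.382, 2166.339),
]

for name, z, expected, observed in TABLE_1:
    print(f"{name:>4}  {ppm_error(expected, observed):+7.1f} ppm  "
          f"M = {neutral_mass(expected, z):.2f} Da")
```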

REFERENCES

-   1. Seminal reports:
-   a. Tuerk C.; Gold L. Science 1990, 249, 505.
-   b. Ellington A. D.; Szostak J. W. Nature 1990, 346, 818.
-   c. Robertson D. L.; Joyce G. F. Nature 1990, 344, 467.

For a general review, see:

-   d. Wilson D. S.; Szostak J. W. Annu. Rev. Biochem. 1999, 68, 611.
-   2. Recent reviews of evolved DNA/RNA aptamers:
-   a. Shamah S. M.; Healy J. M.; Cload S. T. Acc. Chem. Res. 2008, 41, 130.
-   b. Gopinath S. H. B. Anal. Bioanal. Chem. 2007, 387, 171.
-   3. In vitro selections of DNA-linked small molecules:
-   a. Wrenn S. J.; Weisinger R. M.; Halpin D. R.; Harbury P. B. J. Am. Chem. Soc. 2007, 129, 13137.
-   b. Melkko S.; Zhang Y.; Dumelin C. E.; Scheuermann J.; Neri D. Angew. Chem., Int. Ed. 2007, 46, 4671.

Mock Selections:

-   c. Doyon J. B.; Snyder T. M.; Liu D. R. J. Am. Chem. Soc. 2003, 125, 12372.
-   d. Gartner Z. J.; Tse B. N.; Grubina R.; Doyon J. B.; Snyder T. M.; Liu D. R. Science 2004, 305, 1601.
-   4. a. Silverman S. Chem. Commun. 2008, 3467.
-   b. Joyce G. F. Annu. Rev. Biochem. 2004, 73, 791.
-   5. a. Kanan M. W.; Rozenman M. M.; Sakurai K.; Snyder T. M.; Liu D. R. Nature 2004, 431, 545.
-   b. Rozenman M. M.; Liu D. R. ChemBioChem 2006, 7, 253.
-   c. Rozenman M. M.; Kanan M. W.; Liu D. R. J. Am. Chem. Soc. 2007, 129, 14933.
-   6. a. Tarasow T. M.; Tarasow S. L.; Eaton B. E. Nature 1997, 389, 54.
-   b. Seelig B.; Jaschke A. Chem. Biol. 1999, 6, 167.

Other tags may be used. For selected examples, see:

-   c. Chandra M.; Silverman S. K. J. Am. Chem. Soc. 2008, 130, 2936.
-   d. Pradeepkumar P. I.; Hobartner C.; Baum D. A.; Silverman S. K. Angew. Chem., Int. Ed. 2008, 47, 1753.
-   e. Johnston W. K.; Unrau P. J.; Lawrence M. S.; Glasner M. E.; Bartel D. P. Science 2001, 292, 1319.
-   7. a. Breaker R. R.; Joyce G. F. Chem. Biol. 1994, 1, 223.
-   b. Sheppard T. L.; Ordoukhanian P.; Joyce G. F. Proc. Natl. Acad. Sci. U.S.A. 2000, 97, 7802.

For an alternative approach, see:

-   c. Hobartner C.; Pradeepkumar P. I.; Silverman S. K. Chem. Commun. 2007, 2255.
-   8. a. Rozenman M. M.; McNaughton B. R.; Liu D. R. Curr. Opin. Chem. Biol. 2007, 11, 259.

See Also:

-   b. Wrenn S. J.; Harbury P. B. Annu. Rev. Biochem. 2007, 76, 331.
-   9. Note that in contrast with a conventional selection in which unfit library members are discarded and sequences encoding surviving members are replicated in a subsequent step, RD-PCR achieves the selective replication of only those sequences encoding desired library members.
-   10. Ogawa A.; Maeda M. Bioorg. Med. Chem. Lett. 2007, 17, 3156.
-   11. Computation carried out using the Oligonucleotide Modeling Platform (OMP, DNA Software, Inc.)
-   12. For examples of phosphodiester bond formation-dependent PCR, see:
-   a. Bartel D. P.; Szostak J. W. Science 1993, 261, 1411.
-   b. Makrigiorgos G. M. Human Mut. 2004, 23, 406.
-   c. Troutt A. B.; McHeyzer-Williams M. G.; Pulendran B.; Nossal G. J. V. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 9823.
-   13. For a general review, see: Kubista M.; Andrade J. M.; Bengtsson M.; Forootan A.; Jonák J.; Lind K.; Sindelka R.; Sjöback R.; Sjögreen B.; Strömbom L.; Ståhlberg A.; Zoric N. Mol. Asp. Med. 2006, 27, 95.
-   14. Gartner Z. J.; Liu D. R. J. Am. Chem. Soc. 2001, 123, 6961.
-   15. Heck R. F. Org. React. 1982, 27, 345. See also ref 5a.
-   16. As little as 10 fmol of 9d and 9b could be carried through the process with similar results, suggesting that selections can be performed on a very small scale.
-   17. For approaches to protease substrate profiling by derivatization of free α-amines, see:
-   a. Mahrus S.; Trinidad J. C.; Barkan D. T.; Sali A.; Burlingame A. L.; Wells J. A. Cell 2008, 134, 866.
-   b. McDonald L.; Robertson D. H. L.; Hurst J. L.; Beynon R. J. Nat. Methods 2005, 2, 955.
-   c. Gevaert K.; Goethals M.; Martens, Van Damme J.; Staes A.; Thomas G. R.; Vandekerckhove J. Nat. Biotechnol. 2003, 21, 566.
-   18. Substrate designed in consultation with Sigma-Aldrich Protease finder (http://sigma-aldrich.com/proteasefinder).
-   19. DNA-templated amide bond formation with DMT-MM:
-   a. Li X. Y.; Gartner Z. J.; Tse B. N.; Liu D. R. J. Am. Chem. Soc. 2004, 126, 5090.

With EDC/sNHS:

-   b. Gartner Z. J.; Kanan M. W.; Liu D. R. Angew. Chem., Int. Ed. 2002, 41, 1796.

For a review of DTS, see:

-   c. Li X. Y.; Liu D. R. Angew. Chem., Int. Ed. 2004, 43, 4848.

Example 2 Interaction-Dependent PCR: Direct, Solution-Phase In Vitro Selection for Non-Covalent Interaction

ID-PCR (FIG. 16) resembles RD-PCR except that non-covalent binding of one or two receptor-linked DNA strands (nucleic acid template and first primer, respectively) to an analyte, rather than the covalent bonding of the two strands, allows primer extension to generate a double-stranded PCR template. Because binding of one or both DNA strands to the analyte is required for intramolecular primer extension to generate a double-stranded PCR template, reactions in the absence of analyte do not result in significant template formation and therefore yield minimal PCR amplification. In the presence of the analyte, however, binding of the analyte to the DNA-linked receptors brings the nucleic acid template and the first primer into proximity. As a result, their base pairing takes place, enabling primer extension to generate a double-stranded DNA template (FIG. 17).

At least two different analyte binding schemes are provided by which ID-PCR transduces the presence of protein or small-molecule analytes into a PCR amplicon: single-point binding (FIG. 18) and sandwich binding (FIG. 19).

In the single-point binding approach, the nucleic acid template is covalently attached to the analyte, and the first primer is linked to an analyte-specific aptamer or antibody receptor (FIG. 18). Aptamers offer distinct advantages over antibodies in that they are evolved by researchers in the laboratory (as opposed to being selected in the immune system of an animal) and thus can be simultaneously evolved positively for binding to a target analyte and negatively against binding to specific false positive molecules. Aptamers are also less “sticky” (hydrophobic) than antibodies and thus may be less prone to non-specific binding. The covalent linkage between the nucleic acid template and a protein analyte can be established using any of several DNA-linked reactive groups, including esters and alpha-halo carbonyls, that are known to undergo facile reaction with nucleophilic groups at the surface of virtually all proteins. Compared with the sandwich binding method below, this approach offers the advantage that only a single analyte-binding receptor is required. Because the protein will be covalently attached to the nucleic acid template through a variety of regiochemistries and orientations, this approach also maximizes the likelihood that at least one of the protein attachment schemes will allow the key primer extension event that generates the double-stranded PCR template. The single-point binding strategy may not offer maximal sensitivity, however, for this same reason; because the protein is presented to the receptor in a heterogeneous mixture of orientations, some of the bound proteins may not support efficient primer extension. In these cases, a variety of DNA-protein linker lengths and compositions can be tested. The single-point binding approach can further be applied to small-molecule analytes that possess groups that are capable of being attached to DNA-linked nucleophiles or electrophiles.

In the sandwich binding approach, both the nucleic acid template and the first primer are linked to a receptor (for example, an aptamer or antibody) that binds to an analyte. Upon binding of the analyte to both receptors, the intramolecular hairpin is formed and primer hybridization and extension can take place. Compared with the single-point binding strategy, this approach requires a second receptor that is capable of binding the analyte simultaneously with the first receptor, but offers the advantages of (i) not requiring covalent attachment to one of the DNA strands; (ii) not being subject to sensitivity losses that might arise from heterogeneous DNA-bound analyte orientations; (iii) offering greater specificity since both receptors must simultaneously engage the target in order to initiate primer extension and PCR; and (iv) enabling, in principle, a single analyte molecule to give rise to multiple double-stranded PCR template molecules under conditions in which receptor-analyte binding is reversible and occurs concurrently with primer extension.

Importantly, in both approaches the binding affinities required between the receptor(s) and the analyte are lower than those required in traditional intermolecular binding platforms because in both cases the analyte and receptor(s) bind cooperatively. In the first case, receptor-analyte binding and DNA hybridization are cooperative. Our previous studies on DNA-templated synthesis (see Li, X.; Liu, D. R. Angew. Chem. Int. Ed. 43, 4848-4870 (2004) for a review) suggest that this increase in effective molarity can be quite substantial (e.g., ~10³-10⁶-fold). Likewise, in the sandwich binding mode two receptors can both simultaneously bind the analyte while forming a base-paired complex. As a result, even modest affinity receptors may be suitable for the sensing platform proposed here.

PCR and Readout of PCR Product Formation

Once primer extension generates a viable double-stranded PCR template in an analyte-dependent manner, standard PCR methods are used to exponentially amplify the template into many double-stranded DNA amplicons. PCR amplification is routinely used to amplify small numbers (≤ ~10,000) of molecules into more than billions of double-stranded copies. Palm-sized, battery powered PCR thermocyclers are commercially available and similar devices have already been integrated into many portable PCR-based nucleic acid sensing devices that are in use in the field. The presence and the abundance of the resulting PCR products can be read out in a portable, field-deployable way by using one of several commercially available double-strand-specific fluorescent DNA-binding dyes and a commercially available, hand-held fluorimeter. Only those PCR reactions in which product is generated will result in an increase in sample fluorescence, and the intensity of this fluorescence should correlate with the amount of starting template and therefore should be a function of the amount of the analyte and the efficiency of the transduction process. Standard curves can be generated for each analyte of interest to establish a relationship between absolute fluorescence intensity and absolute quantity of the analyte entering the ID-PCR process.
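The scale of amplification mentioned above can be made concrete with a short calculation of how many cycles are needed to take a given number of template molecules to a target copy number. This is an idealized sketch that assumes a constant per-cycle efficiency and ignores plateau effects; the function name `cycles_needed` and the `efficiency` parameter are introduced here for illustration.

```python
def cycles_needed(start_copies, target_copies, efficiency=1.0):
    """Smallest number of cycles for start_copies to reach target_copies,
    assuming each cycle multiplies the copy number by (1 + efficiency)."""
    if start_copies <= 0:
        raise ValueError("start_copies must be positive")
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    copies = float(start_copies)
    cycles = 0
    while copies < target_copies:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles

# 10,000 starting templates amplified to one billion copies.
print(cycles_needed(10_000, 1e9))
# A single starting template amplified to one billion copies.
print(cycles_needed(1, 1e9))
# The same target at 90% per-cycle efficiency (hypothetical value).
print(cycles_needed(10_000, 1e9, efficiency=0.9))
```

Under perfect doubling, each additional tenfold of starting template saves a little more than three cycles, which is the relationship a standard curve exploits.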

Operational Considerations

The ID-PCR-based detection platform described herein offers several potential operational advantages over existing systems. Since, in some embodiments, the analyte binding and primer extension events can take place in the same solution as the subsequent PCR reaction, a single small tube containing buffer, the DNA-linked receptors, primer extension reagents, double-stranded DNA fluorescent dye, and a wax capsule containing PCR reagents (which melts above ~80° C. during the melting step of the first PCR cycle, releasing its content) can be used for the entire assay. This one-pot assay format, featured in some embodiments of the invention, is operationally simple because it would require only that the operator add the analyte to a single tube and place the tube into the thermocycler. Since real-time PCR methods that are already widely used have demonstrated the compatibility of double-stranded DNA fluorescent dyes with PCR reactions, some embodiments are envisioned in which fluorimetry of the sample after PCR takes place in the same tube, and, in certain embodiments, even without moving the tube, for example in embodiments using a fluorimeter that is integrated into the portable device (as it is in modern real-time PCR instruments). In such streamlined workflow embodiments, the sensing platform requires considerably less sample processing and fewer manipulations than most of the currently used protein and small-molecule sensing platforms. Because the PCR reactions used in some embodiments are not limited to DNA sequences determined by nature but instead amplify DNA sequences chosen by the researcher, their length and sequence can be optimized for rapid and highly efficient PCR amplification. For short sequences with tailor-made primer sequences, PCR amplification can currently be performed in 6 minutes (30 cycles, 6-second thermal ramp times between 95° C. and 70° C., no hold times necessary for short amplicons). If ramp times, which depend on the thermal inertia of the sample and instrument stage, as well as the effectiveness of the heating and cooling process, can be halved by doubling temperature change rates to ~8° C. per second, then the PCR step would require only 3 minutes. The fluorescent dye addition and fluorimetry require only seconds. Likewise, it is envisioned that the primer extension step to create the short double-stranded PCR template requires less than one minute. Thus the entire process from addition of the analyte to the ID-PCR reaction through PCR and fluorescence readout in principle can be accomplished in less than 10 minutes. In some embodiments, a handheld device is employed for thermocycling and fluorescence measurement. In some embodiments, an integrated handheld device is employed for both functions. Handheld, battery powered devices for thermocycling and fluorescence measurement are well known in the art and are commercially available (see, for example, www.ahrambio.com/product.html and www.turnerbiosystems.com/instruments/PicoFluor-handheld-fluorometer-fluorimeter-DNA-RNA-protein.php).
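The timing estimate above follows from simple arithmetic, sketched below. The sketch assumes that each cycle consists of one heating ramp and one cooling ramp between the two temperatures with no hold time, which is one reading of the figures quoted in the text; the function name and parameters are introduced here for illustration.

```python
def pcr_time_seconds(cycles, t_high_c, t_low_c, ramp_rate_c_per_s, hold_s=0.0):
    """Total cycling time when each cycle is one up-ramp plus one down-ramp
    between t_low_c and t_high_c, with an optional hold per cycle."""
    ramp_s = (t_high_c - t_low_c) / ramp_rate_c_per_s
    return cycles * (2.0 * ramp_s + hold_s)

# 6-second ramps over 25 degrees correspond to about 4.2 degrees per second.
current_rate = (95.0 - 70.0) / 6.0
current = pcr_time_seconds(30, 95.0, 70.0, current_rate)
doubled = pcr_time_seconds(30, 95.0, 70.0, 2.0 * current_rate)

print(f"ramp rate: {current_rate:.2f} C/s -> {current / 60:.1f} min")
print(f"ramp rate: {2 * current_rate:.2f} C/s -> {doubled / 60:.1f} min")
```

Doubling the ramp rate to about 8.3° C. per second halves the cycling time, consistent with the 6-minute and 3-minute figures given above.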

Although the description of the platform thus far has focused on the detection of a single analyte, because DNA hybridization is sequence-specific and the DNA sequences linked to the receptors are chosen by researchers, it is in principle possible to implement the sensing system in a multiplexed format in which a single tube contains several sets of DNA-linked receptors, each of which binds to a different analyte of interest. In this case the readout system cannot simply rely on the presence of double-stranded DNA produced during PCR but instead must characterize the abundance of different sequences of DNA that are each produced in response to the presence of a different analyte. Several sequence-specific multiplex DNA readout systems are in current use, including Luminex bead-based systems and DNA microarray (“DNA chip”)-based systems. These multiplexed DNA readout systems are more complex and more difficult to implement in a portable form; however, if single-sample multiplexing is highly desired, they may serve as a reasonable readout method. Alternatively, the operational simplicity of the sensing platform may make practical a simpler form of multiplexing in which one sample is simply added to several different tubes, each of which detects the presence of a different analyte. The tubes are processed in parallel and the abundance of PCR product is measured in each tube (a process that takes only a few seconds) by fluorimetry.


Example 3 Interaction-Dependent PCR: Identification of Ligand-Target Pairs from Libraries of Ligands and Libraries of Targets in a Single-Solution-Phase Experiment

ID-PCR is based on the melting temperature (T_(m)) difference between duplex DNA formed intramolecularly versus duplex DNA formed intermolecularly. We hypothesized that binding of a target to its ligand would increase the effective molarity of single-stranded DNA (ssDNA) oligonucleotides linked to the target and ligand, promoting duplex formation between complementary regions on each strand that are otherwise too short to hybridize (FIG. 20a). The resulting hairpin could serve as a starting point for primer extension. Only the newly extended hairpin contains in a single DNA strand two primer (or primer-binding) sequences that enable subsequent PCR amplification. ID-PCR therefore results in the selective amplification of those DNA sequences previously attached to, and therefore encoding, ligand-target pairs (FIG. 20a). In contrast to traditional target-based selections, which rely on the physical separation of active molecules from inactive ones, ID-PCR selectively amplifies DNA encoding active library members. ID-PCR can be applied to a wide variety of targets and potential ligands and to our knowledge is one of the first methods that can identify ligand-target pairs from libraries of small molecules and libraries of targets in a single solution.¹⁴ The nucleic acid-linked (e.g., DNA-linked) ligands required by ID-PCR can be prepared by any suitable method known to those of skill in the art, including, but not limited to, the methods described in the references listed under 12 and 15 in the reference section of this example. Similarly, those of skill in the art will readily envision other methods suitable to generate the nucleic acid-linked (e.g., DNA-linked) targets in addition to the simple non-specific conjugation methods described here. The invention is not limited in this respect.

In order for hairpin extension to report target-ligand binding, it must occur under conditions that both allow binding to take place and enable selective extension of intramolecular duplexes over intermolecular duplexes. Studies on DNA polymerase-mediated extension at 37° C. suggested that a 6-nt complementary region was optimal for enabling intramolecular but not intermolecular duplexes to be extended (FIG. 23). Complementary overlap regions 10 nt and 8 nt in length were efficiently extended intermolecularly, while analogous intermolecular constructs with 6 nt and 4 nt complementary regions were extended poorly or not at all (a and b). Incorporating the 6-nt and 4-nt complementary regions into intramolecular constructs, however, dramatically increased their extension efficiency compared with the corresponding intermolecular constructs (c and d). These results suggest that complementary regions shorter than 8 nt are best suited to benefit from the effective molarity increase caused by target-ligand binding.

FIG. 23 shows optimization of complementary region length. For hairpin extension to effectively report target-ligand binding, it must occur under conditions that both allow binding to take place and selectively extend intramolecular duplexes over intermolecular duplexes. We tested complementary regions of varying lengths to determine which can support the extension of intramolecular but not intermolecular duplexes. A variant of the primer extension protocol described herein was performed, in which an extension mixture consisting of 10× NEB Buffer 2 (1 μL), dNTPs (330 pmol each in 1 μL water), target strand or hairpin (10 pmol in 1 μL water), and 5.5 μL water was prepared. After incubating the reaction at 37° C. for 2 minutes, the ligand strand (10 pmol in 1 μL water) was added. For samples using hairpin oligonucleotides, 1 μL water was added instead. After incubating at 37° C. for 2 minutes, Klenow (2.5 U in 0.5 μL) was added (final reaction volume=10 μL). The reaction was incubated at 37° C. for 20 minutes and the enzyme was inactivated by heating to 75° C. for 20 minutes. The primer extension reaction was analyzed by PAGE (15% TBE-Urea, 200 V, 20 minutes, stained with SYBRI, imaged on a ChemiImager).

We investigated whether binding between a small molecule and a protein could replace a covalent linkage in a DNA hairpin and support extension and PCR. Biotin and streptavidin (SA) (K_(d)=40 pM)⁴ were chosen as an initial ligand-target pair. We reacted SA with NHS ester-linked ssDNA 1a (the target strand) to generate 1a-SA and also synthesized an oligonucleotide (the ligand strand) conjugated at its 3′ end with biotin to provide 2a-biotin. The target and ligand strands shared a 6-nt complementary region. Negative control ligand-DNA conjugates lacking biotin (2) or incapable of hybridizing to 1 (3a-biotin) were also prepared. Each ligand-DNA conjugate (2, 2a-biotin, or 3a-biotin) was individually incubated under identical conditions with 1a-SA and Klenow fragment DNA polymerase at 37° C. and then subjected to qPCR to determine the threshold cycle (C_(T)) value.

Consistent with our hypothesis, the sample containing 1a-SA and 2a-biotin underwent far more efficient PCR amplification than all of the negative controls, resulting in C_(T) values ≥5 cycles lower (corresponding to ≥32-fold more extension product) than those of the 1a-SA+2 control, the 1a-SA+3a-biotin control, or a sample containing 1a-SA+2a-biotin but lacking Klenow (FIG. 20b). A positive control, containing a 10-nt complementary region that hybridizes to 2 independent of target-ligand binding, exhibited a C_(T) value comparable to that of the 1a-SA+2a-biotin sample. Importantly, the addition of excess free biotin abrogated the ID-PCR of 2a-biotin with 1a-SA (FIG. 20b). ID-PCR was surprisingly tolerant of ligand-DNA linker lengths between ~28 and 123 atoms (FIG. 24). Together, these results indicate that specific ligand-target binding can promote DNA extension and trigger the selective PCR amplification of DNA sequences linked to ligand-target pairs.
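
By way of illustration only, the relationship used above between a C_(T) difference and the fold difference in extension product (5 cycles corresponding to 32-fold) can be sketched in Python as follows. The sketch assumes ideal amplification, i.e., that template doubles in every cycle; the function name and the `efficiency` parameter are hypothetical and are not part of the described protocol.

```python
def fold_difference(delta_ct: float, efficiency: float = 1.0) -> float:
    """Fold difference in starting template implied by a C_T gap.

    Assumption: each PCR cycle multiplies the template by (1 + efficiency).
    efficiency = 1.0 is the idealized case of perfect doubling per cycle.
    """
    return (1.0 + efficiency) ** delta_ct


# A 5-cycle C_T advantage under ideal doubling: 2**5
print(fold_difference(5))  # 32.0
```

With a lower assumed efficiency the same C_(T) gap would correspond to a smaller fold difference, so the 32-fold figure is best read as the idealized value.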

FIG. 24 shows the effect of ligand-oligonucleotide linker length on ID-PCR efficiency. Because target-ligand binding involves complexes of large and variable size that might influence complementary region hybridization in unanticipated ways, we sought to understand the structural requirements for successful ID-PCR. We performed ID-PCR on 1a-SA and 2a-biotin conjugates, while varying the length of the linker between the biotin group and the DNA oligonucleotide from 4 to 34 polyethyleneglycol (PEG) units. Primer extension and qPCR reactions were performed as described above (SI-9 and SI-10). For streptavidin (1a-SA), the following ligand strand sequences were used: 2-biotin-17, 2-biotin-38, 2a-biotin, 2-biotin-80, 2-biotin-101, and 2-biotin-123. For trypsin (1j-trypsin), the following ligand strand sequences were used: 2-antipain-7, 2-antipain-28, 2c-antipain, 2-antipain-70, 2-antipain-91, and 2-antipain-113. For carbonic anhydrase (1c-CA), the following ligand strand sequences were used: 2-GLCBS-7, 2-GLCBS-28, 2e-GLCBS, 2-GLCBS-70, and 2-GLCBS-91.

As linker length was increased from 4 to 16 PEG units (7 to 59 atoms), we observed an increase in ΔC_(T) values upon ligand-receptor binding towards the positive control value of 8 cycles, suggesting that the DNA-ligand linker must be sufficiently long to simultaneously accommodate ligand-target binding, primer-template hybridization, and DNA polymerase binding. As the length of the PEG linker was further increased from 16 to 34 PEG units, ΔC_(T) values slowly decreased, consistent with the expected decrease in effective molarity of the primer and template as the linker is further lengthened. Similar results were obtained from analogous experiments performed with 1b-trypsin and 2c-antipain sequences and with 1c-CA and 2e-GLCBS sequences. Collectively, these results demonstrate that a single linker structure is appropriate for the three targets investigated here and suggest that ID-PCR will be able to accommodate targets of varying size and relative orientation.

Next we tested the ability of ID-PCR to report ligand-target interactions of much lower affinity (K_(d)=~2 nM to ~3 μM; protein target affinities of DNA-linked ligands other than biotin were measured and found to be within 5-fold of the reported affinities for the free small molecules (FIG. 25)). Ligand strand 2 was conjugated to a lower affinity SA ligand, desthiobiotin (K_(d)=2 nM).^(5,6) When 2b-desthiobiotin was combined with 1a-SA and subjected to Klenow extension and qPCR, 2b-desthiobiotin was amplified with efficiency comparable to that of 2a-biotin (FIG. 20b). Similarly, trypsin and antipain (K_(i)=100 nM)⁷ were conjugated to DNA to form 1b-trypsin and 2c-antipain. The 1b-trypsin+2c-antipain pair resulted in a seven-cycle C_(T) advantage relative to negative controls lacking ligand, containing 2 conjugated to unrelated ligands, or containing an excess of free antipain (FIGS. 20c, 24, and 25).

FIG. 25 shows affinities of ligands and ligand-DNA conjugates for protein targets. The binding, annealing, and extension steps of ID-PCR were performed at 37° C., and for this reason, the affinity measurements for carbonic anhydrase and trypsin were taken at this temperature. (a) The affinity of 2d-CBS for carbonic anhydrase is 1.50 μM, and is approximately 5-fold higher than the affinity of CBS for CA (7.53 μM). This difference is likely due to the presence of the hydrophobic hexylamine at the 3′ end of the oligonucleotide, rather than an effect of the oligonucleotide itself; it is well known that derivatives of CBS containing hydrophobic groups at this position have increased affinities for carbonic anhydrase.¹⁸ (b) As previously reported,¹⁹ addition of a hydrophobic dipeptide (Gly-Leu) to CBS increased its affinity for CA (GLCBS K_(i)=470 nM). Conjugation of GLCBS to DNA did not greatly affect its affinity for CA (2s-GLCBS K_(i)=380 nM). (c) The apparent K_(d) of 2b-desthiobiotin for streptavidin is 8 nM. (d) Antipain and 2c-antipain have similar affinities for trypsin (K_(i)=39.5 nM and K_(i)=21.7 nM, respectively).

DNA encoding carbonic anhydrase II (CA) (1c-CA) and CA ligands 4-carboxybenzene sulfonamide (CBS) (K_(d)=3.2 μM)⁸ (2d-CBS) and Gly-Leu-CBS (K_(d)=9 nM)⁹ (2e-GLCBS) was amplified far more efficiently (six- to seven-cycle ΔC_(T)) than control reactions lacking ligand (2), containing a mismatched complementary region (3b-CBS), or containing an excess of free GLCBS (FIGS. 20d, 24, and 25). Collectively, these results suggest that ID-PCR can serve as a general method to detect a wide variety of small molecule-protein interactions of varying affinities.

Any class of intermolecular interaction can be detected by ID-PCR. Next we tested the ability of ID-PCR to selectively amplify DNA sequences encoding nucleic acid aptamer-ligand pairs.¹⁰ A 68-nt DNA aptamer that binds daunomycin (K_(d)=272 nM)¹¹ and doxorubicin was synthesized at the 5′ end of the target strand to generate 1d-aptamer. Daunomycin or doxorubicin was conjugated to the ligand strand to afford 2g-Dn or 2h-Dx, respectively. Consistent with the above results for protein targets, ID-PCR reactions containing both 1d-aptamer and either 2g-Dn or 2h-Dx were amplified more efficiently than samples with 2f in place of 2g-Dn (ΔC_(T)=8 cycles), or samples containing free doxorubicin (ΔC_(T)=4 cycles) (FIG. 20e). These results indicate that ID-PCR can be used to selectively amplify DNA linked to small molecule-aptamer pairs.

We wondered if the covalent bond between the target and the target oligonucleotide could be replaced by a non-covalent interaction, resulting in ID-PCR of a ternary complex between two DNA-linked ligands and a multivalent target.^(5,12,13) To test this possibility, we conjugated biotin to the target strand to generate 1e-biotin, which can hybridize with 2a-biotin. We hypothesized that in the presence of SA a ternary complex of SA and two DNA-linked biotin ligands would form, enabling DNA hybridization, extension, and amplification. Indeed, ID-PCR in this “sandwich” mode detected as little as 2×10⁻¹⁹ moles (200 zeptomoles) SA (FIG. 26). These results suggest the potential of ID-PCR for the sensitive detection of multivalent analytes in sandwich assays.
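
To put the reported detection limit in perspective, the following illustrative Python sketch converts 2×10⁻¹⁹ moles into zeptomoles and into an approximate number of SA molecules. The helper names are hypothetical; the only inputs are the detection limit stated above and the Avogadro constant.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def moles_to_molecules(moles: float) -> float:
    """Number of molecules contained in the given amount of substance."""
    return moles * AVOGADRO


def moles_to_zeptomoles(moles: float) -> float:
    """1 zeptomole = 1e-21 mole."""
    return moles / 1e-21


detection_limit_mol = 2e-19  # value reported for sandwich-mode ID-PCR

print(round(moles_to_zeptomoles(detection_limit_mol)))  # 200
print(f"{moles_to_molecules(detection_limit_mol):.2e}")  # on the order of 1e5 molecules
```

Under these assumptions the detection limit corresponds to roughly 120,000 streptavidin molecules in the reaction.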

FIG. 26 exemplifies the detection of multivalent analytes by ID-PCR. FIG. 26(a) shows a scheme for detection of multivalent analytes using ID-PCR. A non-covalent interaction can replace the covalent bond between the target and the target strand. In the presence of the analyte, a ternary complex comprising the analyte and two ligands forms, promoting hybridization of the complementary regions on the ligand-linked oligonucleotides and resulting in increased amplification efficiency of the DNA. (b) 200 fmol of 1e-biotin or 1 and 200 fmol of 2a-biotin or 2 were diluted in 14 μL of 1.5× NEB Buffer 2. After incubation at 94° C. for 5 minutes followed by 5 minutes at 37° C., 1 μL of H₂O with or without 100 fmol of streptavidin was added. Following incubation at 37° C. for 15 minutes, 2.5 U of Klenow Fragment exo⁻ in 5 μL H₂O pre-equilibrated at 37° C. were added. The primer extension reaction was incubated at 37° C. for 15 minutes. The polymerase was inactivated by heating at 75° C. for 20 minutes. The extension reaction was subjected to qPCR as described above. Samples containing 1e-biotin, 2a-biotin, and streptavidin are amplified more efficiently than samples lacking streptavidin, or lacking one of the biotin moieties (ΔC_(T)=6-7 cycles). (c) For the detection of sub-attomole quantities of streptavidin a slight variation of the protocol was applied: 200 amol of 1e-biotin and 2a-biotin were incubated with 200, 20, 2, 0.2, or 0.02 amol of streptavidin, followed by addition of Klenow as above. The primer extension mixture was not diluted for qPCR analysis. In this sandwich mode ID-PCR can be used to detect as little as 2×10⁻¹⁹ moles (200 zeptomoles) of streptavidin.

Since applications of ID-PCR include library screening, we performed a series of model selections to test the ability of ID-PCR to selectively enrich DNA encoding authentic ligands in the presence of an excess of non-binding small molecule-DNA conjugates (FIG. 21a). A 1:10, 1:100 or 1:1000 ratio of 2i-biotin:2k-GLCBS was combined with 1a-SA and subjected to Klenow extension and PCR. The same mixtures of 2i-biotin:2k-GLCBS were also subjected to ID-PCR in the presence of 1 (without SA) as a control. When ID-PCR was performed with 1a-SA, the biotin-encoding sequence 2i was strongly enriched among the resulting PCR products (FIG. 21b). In contrast, ID-PCR with 1 resulted in no enrichment of the biotin-linked strand. Similarly strong enrichment was observed for CA-GLCBS binding and for DNA aptamer-daunomycin binding in the presence of large excesses of non-binding conjugates (FIG. 21c-e). These findings demonstrate that ID-PCR can enrich DNA encoding a ligand ~100-fold over DNA encoding small molecules without target affinity for a variety of protein and nucleic acid targets.

Finally, we tested the ability of ID-PCR to simultaneously evaluate all possible combinations of multiple ligands and multiple targets in a single solution. We performed a model selection in which five small-molecule ligands (biotin, desthiobiotin, GLCBS, CBS, and antipain) and three targets (SA, CA, and trypsin), each conjugated to unique sequence tags, were present in one solution containing a 250-fold excess of a DNA-linked ligand (hexylamine) and a 250-fold excess of a DNA-linked target (glutathione S-transferase) not known to interact with the other ligands or targets. The negative control ligand and target were conjugated to libraries of 256 different sequence tags. The resulting solution therefore contained equimolar quantities of each of 261 ligand sequences and each of 259 target sequences, collectively representing 67,599 possible ligand-target sequence combinations. A control sample was prepared identically except using DNA not conjugated to any protein targets. Both samples were subjected to ID-PCR followed by high-throughput DNA sequencing.
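
The library-size figures quoted above can be reproduced with a short calculation. The following Python sketch is illustrative only; the variable names are hypothetical. It assumes that the 256 negative-control tags arise from the four degenerate N positions present in the tagged sequences (e.g., 1j and 2v listed under DNA Sequences), since 4⁴=256.

```python
named_ligands = ["biotin", "desthiobiotin", "GLCBS", "CBS", "antipain"]
named_targets = ["SA", "CA", "trypsin"]

# Assumption: four degenerate positions (N = A, T, C, or G) give 4**4 tags
control_tags = 4 ** 4  # 256 tags on the negative-control ligand and target

n_ligand_seqs = len(named_ligands) + control_tags  # 261
n_target_seqs = len(named_targets) + control_tags  # 259
n_combinations = n_ligand_seqs * n_target_seqs     # 67,599

known_pairs = 5  # the five expected ligand-target interactions
n_presumed_nonbinding = n_combinations - known_pairs  # 67,594

print(n_ligand_seqs, n_target_seqs, n_combinations, n_presumed_nonbinding)
```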

For each of the three different proteins in the library, the most highly enriched sequences relative to the control sample correspond to their known ligands (FIG. 22), despite the large excess of non-binding ligands and the fact that ligand-target affinities span five orders of magnitude. The mean enrichment factor across all 67,599 possibilities was 1.4, while the enrichment factors corresponding to the five known ligand-target pairs ranged from 75 to 3,000. Only three enrichment factors above 75 were observed among presumed non-binding pairs out of 67,594 possibilities, representing a low false positive rate (Tables 2 and 3). These results establish the ability of ID-PCR to evaluate a small-molecule library for affinity to a protein target library in a single experiment, and suggest that ID-PCR can identify ligand-target pairs across a wide range of affinities in a highly multiplexed format.

Materials and Methods

General Methods.

All chemical reagents were purchased from Sigma-Aldrich, unless otherwise noted. Water was purified with a Milli-Q purification system. DNA oligonucleotides were synthesized on a PerSeptive Biosystems Expedite 8909 DNA synthesizer or purchased from Integrated DNA Technologies. All reagents and phosphoramidites for DNA synthesis were purchased from Glen Research. All oligonucleotides were synthesized and deprotected according to manufacturer's protocols. Oligonucleotides were purified by reverse-phase high-pressure liquid chromatography (HPLC, Agilent 1200) using a C18 stationary phase (Eclipse-XDB C18, 5 μm, 9.4×200 mm) and an acetonitrile/100 mM triethylammonium acetate gradient. Oligonucleotide concentrations were quantitated by UV spectroscopy using a Nanodrop ND1000 spectrophotometer. Non-commercial oligonucleotides were characterized by LC/ESI-MS; reverse-phase separation was performed on an Alliance 2695 (Waters) HPLC system using a UPLC BEH C18 column (1.7 μm, 2.1×50 mm) stationary phase and 6 mM aqueous triethylammonium bicarbonate/MeOH mobile phase interfaced to a Q-Tof Micro mass spectrometer (Waters). Oligonucleotides greater than 70 nt in length were analyzed by PAGE.

DNA Sequences

In the sequences below: <3>=biotinTEG phosphoramidite (Glen Research, 10-1955); <4>=3′ Thiol Modifier C6 S-S (20-2936); <5>=3′ biotinTEG (20-2955); <6>=3′ desthiobiotinTEG (20-2952); <7>=Spacer 18 (10-1918); <8>=5′ CarboxyC10 (10-1935); <9>=3′ aminoC6 (20-2956); <0>=Cy3 (10-5913); and N is an equimolar solution of A, T, C, and G phosphoramidites. Underlined sequences are barcodes or sequences recognized by restriction endonucleases. Italicized portions of primer sequences are adaptor sequences required for Illumina sequencing (©2007-2009. Illumina, Inc. All rights reserved). SEQ ID NOs are given in parentheses.

1, 1b, 1c: 5′-<8><7>CGGCGATCGTGAAGGAGGCTAGACTGAGTGAG-3′ (51)
1a: 5′-<8><7><0>CGGCGATCGTGAAGGAGGCTAGACTGAGTGAG-3′ (52)
positive control: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACTCAGT<9>-3′ (53)
2: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><9>-3′ (54)
2a-biotin: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><5>-3′ (55)
3a-biotin: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATTACTAG<7><7><5>-3′ (56)
2b-desthiobiotin: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><6>-3′ (57)
2c-antipain, 2d-CBS, 2e-GLCBS: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><9>-ligand-3′ (58)
3b-CBS: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATTACTAG<7><7><9>-CBS-3′ (59)
1d-aptamer: 5′-ACCATCTGTGTAAGGGGTAAGGGGTGGGGGTGGGTACGTCTCGGCGATCGTGAAGGAGACTGAGTGAG-3′ (60)
2f: 5′-GGATTCTGTATGACTGTCCCACGTATCTCACT<7><4>-3′ (61)
2g-Dn, 2h-Dx: 5′-GGATTCTGTATGACTGTCCCACGTATCTCACT<7><4>-linker-ligand-3′ (62)
2i-biotin: 5′-TGGATCGTGATGACTGTCCCGACAAGAATTCGTATCTCACT<7><7><5>-3′ (63)
2k-GLCBS: 5′-TGGATCGTGATGACTGTCCCGACAAGCTTACGTATCTCACT<7><7><9>-GLCBS-3′ (64)
1f-aptamer: 5′-ACCATCTGTGTAAGGGGTAAGGGGTGGGGGTGGGTACGTCTCGGCGATCGTGAAGGAGTAAGCTACTGAGTGAG-3′ (65)
1g-aptamer: 5′-ACCATCTGTGTAAGGGGTAAGGGGTGGGGGTGGGTACGTCTCGGCGATCGTGAAGGAGATGCATACTGAGTGAG-3′ (66)
1h-unstructured DNA: 5′-ATTGATCACTTGATTTCTGCCCATTGATTAAAGTCGCAAGTCGGCGATCGTGAAGGAGTAAGCTACTGAGTGAG-3′ (67)
2l: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><4>-3′ (68)
2m: 5′-TGGATCGTGATGACTGTCCCGACAAATGCATGTATCTCACT<7><4>-3′ (69)
2n-Dn: 5′-TGGATCGTGATGACTGTCCCGACAAATGCATGTATCTCACT<7><4>-linker-Dn-3′ (70)
1i: 5′-<8><7>CGGCGATCGTGAAGGAGGCTCGAGTGAGTGAG-3′ (71)
1k: 5′-<8><7>CGGCGATCGTGAAGGAGGCTAGCCTGAGTGAG-3′ (72)
1j: 5′-<8><7>CGGCGATCGTGAAGGAGGANNNNTTGAGTGAG-3′ (73)
2o-CBS: 5′-TGGATCGTGATGACTGTCCCGACAAGCTTACGTATCTCACT<7><7><9>-CBS-3′ (74)
2p: 5′-TGGATCGTGATGACTGTCCCGACAACCATGGGTATCTCACT<7><7><9>-3′ (75)
2q-desthiobiotin: 5′-TGGATCGTGATGACTGTCCCGACAACCATGGGTATCTCACT<7><7><9>-desthiobiotin-3′ (76)
2r: 5′-TGGATCGTGATGACTGTCCCGACAAGGATCCGTATCTCACT<7><7><9>-3′ (77)
2s-GLCBS: 5′-TGGATCGTGATGACTGTCCCGACAAGGATCCGTATCTCACT<7><7><9>-GLCBS-3′ (78)
2t: 5′-TGGATCGTGATGACTGTCCCGACAAAGTACTGTATCTCACT<7><7><9>-3′ (79)
2u-antipain: 5′-TGGATCGTGATGACTGTCCCGACAAAGTACTGTATCTCACT<7><7><9>-antipain-3′ (80)
2v-amine: 5′-TGGATCGTGATGACTGTCCCGACAATNNNNAGTATCTCACT<7><7><9>-3′ (81)
Primer A: 5′-TGGATCGTGATGACTGTCCC-3′ (82)
Primer B: 5′-CGGCGATCGTGAAGGAG-3′ (83)
Primer C: 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTTCTCGCGGCGATCGTGAAGGAG-3′ (84)
Primer D: 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCTACCCGGCGATCGTGAAGGAG-3′ (85)
Primer E: 5′-CAAGCAGAAGACGGCATACGAGCTCTTCCGATCTTGGATCGTGATGACTGTCCC-3′ (86)

FIG. 23:
1-4 nt overlap: 5′-<8><7>CGGGGATCGTGAAGGAGGCTAGACTG-3′ (87)
1-6 nt overlap: 5′-<8><7>CGGGGATCGTGAAGGAGGCTAGACTGAG-3′ (88)
1-8 nt overlap: 5′-<8><7>CGGGGATCGTGAAGGAGGCTAGACTGAGTG-3′ (89)
1-10 nt overlap: 5′-<8><7>CGGGGATCGTGAAGGAGGCTAGACTGAGTGAG-3′ (90)
2: 5′-TGAGTCGTGATGACTGTCCCGACAAGCTTACGTATCTCACTCAGT<9>-3′ (91)
hairpin-4 nt overlap: 5′-TGAGTCGTGATGACTGTCCCGACAAGCTTACGTATCTCACTCAG<7>CGGGGATCGTGAAGGAGGCTAGACTG-3′ (92)
hairpin-6 nt overlap: 5′-TGAGTCGTGATGACTGTCCCGACAAGCTTACGTATCTCACTCAG<7>CGGGGATCGTGAAGGAGGCTAGACTGAG-3′ (93)

FIG. 24:
2-biotin-17: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<5>-3′ (94)
2-biotin-38: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><5>-3′ (94)
2-biotin-80: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><5>-3′ (94)
2-biotin-101: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><7><5>-3′ (94)
2-biotin-123: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><7><7><5>-3′ (94)
2-amine-7: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<9>-3′ (94)
2-amine-28: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><9>-3′ (94)
2-amine-70: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><9>-3′ (94)
2-amine-91: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><7><9>-3′ (94)
2-amine-112: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><7><7><9>-3′ (94)
2-antipain-7: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<9>-antipain-3′ (94)
2-antipain-28: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><9>-antipain-3′ (94)
2-antipain-70: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><9>-antipain-3′ (94)
2-antipain-91: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><7><9>-antipain-3′ (94)
2-antipain-112: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><7><7><9>-antipain-3′ (94)
2-GLCBS-7: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<9>-GLCBS-3′ (94)
2-GLCBS-28: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><9>-GLCBS-3′ (94)
2-GLCBS-70: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><9>-GLCBS-3′ (94)
2-GLCBS-91: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><7><9>-GLCBS-3′ (94)
2-GLCBS-112: 5′-TGGATCGTGATGACTGTCCCGACAAGCATACGTATCTCACT<7><7><7><7><7><9>-GLCBS-3′ (94)

FIG. 25:
1e-biotin: 5′-<3><7>CGGCGATCGTGAAGGAGGCTAGACTGAGTGAG-3′ (95)

FIG. 26:
complementary 20mer: 5′-GGGACAGTCATCACGA<0>TCCA-3′ (96)

Synthesis of Ligand-DNA Conjugates

2a-biotin, 2b-desthiobiotin, and 3a-biotin were prepared directly during oligonucleotide synthesis using commercially available phosphoramidites. 2c, 2d, 2e, 2k, 2o, 2q, 2s, 2u, and 3c were prepared by conjugating antipain, CBS, GLCBS, or desthiobiotin (referred to below as the “ligand”) to the corresponding 3′-amine modified oligonucleotide (sequences 2 or 3 above). To 215 μL DMSO was added the ligand (1.25 μmol in 12.5 μL DMSO), sulfo-NHS (3.33 μmol in 10 μL 2:1 DMSO:H₂O), and 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC, 1.2 μmol in 12 μL DMSO). After the reaction mixture (final volume=249.5 μL) was stirred at room temperature for 30 minutes, 3′-amine modified oligonucleotide (5 nmol in 10 μL H₂O) and triethylamine/HCl pH 10 (50 μL of a 500 mM stock solution) were added. The reaction was stirred at room temperature for 12 hours. Tris-HCl, pH 8.0 (20 μL of a 500 mM stock solution) was added and the reaction mixture was incubated for 1 h at room temperature. The products were purified by reverse-phase HPLC, precipitated with ethanol, and characterized by UV/Vis spectroscopy and LC/MS.
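
As a hedged bookkeeping sketch, the volumes and amounts stated in the conjugation procedure above can be checked in Python as follows. The dictionary keys and variable names are hypothetical labels chosen for illustration; all numerical inputs are taken from the procedure.

```python
# Volumes (microliters) combined in the activation step
activation_volumes_ul = {
    "DMSO": 215.0,
    "ligand in DMSO": 12.5,
    "sulfo-NHS in 2:1 DMSO:H2O": 10.0,
    "EDC in DMSO": 12.0,
}
activation_total_ul = sum(activation_volumes_ul.values())  # stated as 249.5 uL

# Molar excess of ligand over oligonucleotide (1.25 umol vs 5 nmol)
ligand_nmol = 1.25 * 1000.0
oligo_nmol = 5.0
ligand_equivalents = ligand_nmol / oligo_nmol

# Volume after adding the oligonucleotide (10 uL) and the pH 10 buffer (50 uL)
coupling_total_ul = activation_total_ul + 10.0 + 50.0

print(activation_total_ul, ligand_equivalents, coupling_total_ul)
```

On these figures the ligand is present in 250-fold molar excess over the amine-modified oligonucleotide during coupling.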

Synthesis of DNA-Daunomycin and DNA-Doxorubicin Conjugates

To 3′-dithiol-modified DNA (4 nmol in 37.5 μL 1× PBS buffer, sequence 2f or 2m above) were added 7.5 μL of 500 mM pH 8 HEPES buffer and 5 μL of 1 M aqueous DTT. After 30 minutes at room temperature, the DNA fraction was isolated by size exclusion chromatography (SEC) using Centri-Sep spin columns (Princeton Separations, Inc.). To the deprotected 3′ thiol-modified DNA was then added 29 μL of 3× PBS buffer supplemented with 3 mM EDTA and the appropriate small molecule (daunomycin or doxorubicin) (200 nmol in 20 μL of 10 mM pH 7.4 Tris-HCl buffer). The bifunctional linker SM(PEG)₂₄ (Piercenet) was then added (100 nmol in 1 μL of DMSO). After brief agitation, the reaction mixture was incubated at room temperature for 90 minutes, and then purified by SEC using a Nap-5 column (GE Healthcare). The product was further purified by HPLC, and characterized by UV/Vis spectroscopy and LC/MS.

LC/MS Characterization of Oligonucleotides and Ligand-Oligonucleotide Conjugates

All raw data were processed using MassLynx MaxEnt1 (Waters Micromass) to obtain the deconvoluted mass using m/z 1500-5000 and the following MaxEnt1 parameters: 5000 Da output mass range around expected mass (from 5000 to 25000 Da, depending on construct); 0.1 Da output resolution; minimum intensity ratio left and right, 33%; width at half height for uniform Gaussian model, 0.75; number of iterations, 10.

Oligonucleotide         Expected Mass (Da)   Observed Mass (Da)
1                       10614.2              10612.0
1a                      11121.7              11119.0
positive control        13975.2              13973.4
2                       13427.9              13425.5
2a-biotin               13818.4              13815.2
2b-desthiobiotin        13788.4              13786.0
2c-antipain             14014.6              14012.7
2e-GLCBS                13771.9              13770.0
2d-CBS                  13601.9              13601.8
3a-biotin               13882.5              13879.7
3b                      13492.1              13489.3
3c-CBS                  13675.1              13675.5
2f                      10437.0              10437.0
2g-Dn                   12113.8              12112.0
2h-Dx                   12129.8              12127.1
2i-biotin               13833.4              13831.4
2j                      13419.0              13415.8
2k-GLCBS                13771.9              13770.1
1e-biotin               10933.3              10932.8
2l                      13232.8              13232.7
2m                      13246.8              13246.2
2n-Dn                   14923.6              14919.7
1i                      10630.1              10628.6
1j                      10590.5              10589.2
1k                      10629.2              10626.0
2o-CBS                  13601.9              13598.8
2p                      13443.8              13441.8
2q-desthiobiotin        13640.2              13636.4
2r                      13443.9              13440.6
2s-GLCBS                13796.9              13797.3
2t                      13442.9              13440.1
2u-antipain             14029.6              14026.9
2v                      13443.0              13441.2
2-for FigS1             13966.2              13965.0
1-4 nt overlap          8735.9               8734.1
1-6 nt overlap          9378.4               9377.1
1-8 nt overlap          10011.8              10008.4
1-10 nt overlap         10654.2              10651.4
hairpin-4 nt overlap    22334.3              22331.0
hairpin-6 nt overlap    22976.3              22971.5
2-biotin-17             13129.8              13126.6
2-biotin-38             13474.1              13470.1
2-biotin-80             14162.7              14159.6
2-biotin-101            14507.0              14503.4
2-biotin-123            14851.3              14847.2
2-amine-7               12739.4              12740.0
2-amine-28              13083.7              13082.0
2-amine-70              13772.3              13770.0
2-amine-91              14116.6              14113.1
2-amine-112             14460.9              14458.6
2-antipain-7            13328.1              13322.0
2-antipain-28           13672.4              13667.0
2-antipain-70           14361.0              14355.0
2-antipain-91           14705.3              14700.9
2-antipain-112          15049.6              15044.4
2-GLCBS-7               13092.4              13088.9
2-GLCBS-28              13436.7              13435.2
2-GLCBS-70              14125.3              14124.8
2-GLCBS-91              14469.6              14466.8
complementary 20mer     6631.6               6616.1
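
For readers who wish to compare observed and expected masses, the following Python sketch computes the deviation in Da and in parts per million for a few entries of the table above. It is illustrative only: the function names are hypothetical, only three table rows are included, and no acceptance threshold is implied by the source.

```python
def mass_deviation_da(expected: float, observed: float) -> float:
    """Observed minus expected mass, in Da."""
    return observed - expected


def mass_deviation_ppm(expected: float, observed: float) -> float:
    """Relative deviation of the observed mass, in parts per million."""
    return (observed - expected) / expected * 1e6


# (name, expected Da, observed Da) copied from the table above
entries = [
    ("1", 10614.2, 10612.0),
    ("2a-biotin", 13818.4, 13815.2),
    ("complementary 20mer", 6631.6, 6616.1),
]

for name, expected, observed in entries:
    da = mass_deviation_da(expected, observed)
    ppm = mass_deviation_ppm(expected, observed)
    print(f"{name}: {da:+.1f} Da ({ppm:+.0f} ppm)")
```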

Synthesis of Protein-DNA Conjugates

To 50 μL of sodium MES buffer (50 mM, pH 5.4) was added sNHS (1.6 mmol in 5 μL 2:1 DMSO:H₂O), EDC (500 nmol in 5 μL DMSO), and the corresponding 5′-carboxylate modified oligonucleotide (0.8-7 nmol in 22-35 μL water, sequences 1, 1a, 1b, 1c, 1i, 1k, or 1j above). The resulting solution was incubated at room temperature for 30 minutes and then subjected to SEC using an Illustra MicroSpin G-25 spin column (GE Healthcare). The corresponding protein (streptavidin (New England Biolabs), bovine carbonic anhydrase II, bovine trypsin, or glutathione S-transferase (Sigma Aldrich); 0.4-5.7 nmol protein in 5-50 μL PBS) was added to a final stoichiometry of ~0.7:1 protein:DNA. The reaction mixture was incubated at room temperature for 4 hours, then quenched by addition of Tris-HCl pH 8.0 (10 μL of a 500 mM stock solution). The protein-DNA conjugates were purified by SEC on a Sephadex 75 10/300 column (GE Healthcare) using an AKTA FPLC (Amersham Biosciences) over 1.5 column volumes at a flow rate of 0.8 mL/min in 1× PBS. Fractions were characterized by SDS-PAGE (FIG. 27). 1a-SA was quantitated by UV-Vis spectroscopy using absorption of the Cy3 chromophore.

PAGE characterization of DNA-target conjugates (FIG. 27). Protein-DNA conjugates were synthesized and fractionated as described in the Methods section. Protein-DNA conjugates were characterized and quantitated by SDS-PAGE and densitometry. Representative gels are shown here. Aliquots (12 μL) of fractions collected from the FPLC were diluted in 4× NuPAGE Sample Loading Buffer (4 μL) (Invitrogen), heated to 95° C. for 5 minutes and analyzed by electrophoresis (12% Bis-Tris gel, 150 V, 45 minutes). The gel was stained with Sypro Ruby Gel Stain and imaged on a ChemiImager. Conjugation of the oligonucleotide to the protein results in the appearance of higher molecular weight bands. Protein dilution series standards were used to quantitate the protein-DNA conjugates. (a) Conjugation of 1i to CA. Fractions 9 and 10 were pooled for use in ID-PCR experiments. (b) Conjugation of 1j to trypsin. Fractions 10 and 11 were pooled for use in ID-PCR experiments. (c) Conjugation of 1k to GST. Fractions 8, 9, and 10 were pooled for use in ID-PCR experiments.

TABLE 2
Enrichment factors observed for the five expected interactions as well as the mean enrichment of all 67,599 possible combinations of ligand and target sequences.

Ligand                           Target               Enrichment Factor
biotin                           streptavidin         860
desthiobiotin                    streptavidin         3078
antipain                         trypsin              331
GLCBS                            carbonic anhydrase   225
CBS                              carbonic anhydrase   75
mean of all enrichment factors                        1.4

TABLE 3
The number of presumed false positives (out of 67,594 possibilities) at a variety of enrichment factor thresholds.

Enrichment Factor   Number of Presumed False Positives
10                  1117
25                  82
50                  7
75                  3
100                 0
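
The counts in Table 3 can be expressed as presumed false positive rates by dividing by the 67,594 presumed non-binding combinations. The following Python sketch is illustrative only; the variable names are hypothetical and the inputs are copied from Table 3.

```python
presumed_nonbinding = 67_594

# Enrichment factor threshold -> number of presumed false positives (Table 3)
false_positives = {10: 1117, 25: 82, 50: 7, 75: 3, 100: 0}

# Fraction of presumed non-binding pairs at or above each threshold
false_positive_rate = {
    threshold: count / presumed_nonbinding
    for threshold, count in false_positives.items()
}

for threshold, rate in false_positive_rate.items():
    print(f"threshold {threshold}: {rate:.4%}")
```

At the threshold of 75, which equals the lowest enrichment factor among the five known pairs in Table 2, fewer than 1 in 20,000 presumed non-binding combinations is scored as a hit.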

Synthesis of Gly-Leu-CBS

Carboxy benzene sulfonamide (20 μmol in 20 μL dry DMF (J.T. Baker)), N-hydroxysuccinimide (22 μmol in 22 μL dry DMF), and EDC-HCl (14 μmol in 140 μL dry DMF) were mixed together and stirred at room temperature for 4 hours. The dipeptide H-leucine-glycine-OH (2 μmol in 20 μL dry DMF) and DIPEA (1.2 μL, 8 μmol) were added and the resulting mixture was stirred overnight at room temperature. ESI-MS calculated for [M-H⁺]⁻: 370.41; found: 369.97. ¹H NMR (DMSO-d₆, 500 MHz) δ 8.72 (d, 1H, J=8.5 Hz), 8.30 (t, 1H, J=5.5 Hz), 8.09 (d, 2H, J=8 Hz), 7.94 (d, 2H, J=8 Hz), 7.52 (s, 2H), 4.61 (m, 1H), 3.77 (m, 2H), 1.72 (m, 2H), 1.62 (m, 1H), 0.95 (d, 3H, J=6.5 Hz), 0.92 (d, 3H, J=6.5 Hz). ¹³C NMR (DMSO-d₆, 125 MHz) δ 173.0, 171.8, 166.0, 147.0, 137.7, 128.9, 126.2, 52.4, 41.5, 41.0, 25.1, 23.8, 22.0.

Primer Extension with Klenow Fragment

An extension mixture consisting of 10× NEB Buffer 2 (2 μL), dNTPs (660 pmol each in 2 μL water), and the target strand (200 fmol in 9 μL PBS) was incubated at 37° C. for 5 minutes. The appropriate ligand strand (200 fmol in 2 μL water) was added and the reaction mixture was incubated at 37° C. for 15 minutes. Klenow fragment exo⁻ (2.5 U in 5 μL of 1× NEB Buffer 2, New England Biolabs) was added. The primer extension reaction (final volume=20 μL) was incubated at 37° C. for 15 minutes. The polymerase was inactivated by heating to 75° C. for 20 minutes.
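
The component volumes above sum to the stated 20 μL final volume, from which final concentrations follow directly. The Python sketch below is a bookkeeping illustration only; the dictionary keys and the helper function are hypothetical names, and all numbers come from the protocol above.

```python
# Component volumes (microliters) of the Klenow primer extension reaction
volumes_ul = {
    "10x NEB Buffer 2": 2.0,
    "dNTPs": 2.0,
    "target strand in PBS": 9.0,
    "ligand strand": 2.0,
    "Klenow fragment exo-": 5.0,
}


def final_concentration(amount: float, total_volume_ul: float) -> float:
    """Amount per microliter; pmol/uL equals uM and fmol/uL equals nM."""
    return amount / total_volume_ul


total_ul = sum(volumes_ul.values())               # stated final volume: 20 uL
dntp_um = final_concentration(660.0, total_ul)    # 660 pmol of each dNTP -> uM
strand_nm = final_concentration(200.0, total_ul)  # 200 fmol of each strand -> nM

print(total_ul, dntp_um, strand_nm)  # 20.0 33.0 10.0
```

The target and ligand strands are therefore each present at 10 nM during extension, with each dNTP at 33 μM.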

Primer Extension Conditions for Aptamer Experiments

The aptamer-DNA conjugate (1d-aptamer, 1f-aptamer, 1g-aptamer, or 1h-unstructured listed above) (200 pmol) was diluted into 18 μL extension buffer (1× NEB Buffer 2 supplemented with 1 mM CaCl₂, 5 mM KCl, and 33 μM dNTPs). This solution was heated to 95° C., then gradually cooled to 37° C. and incubated for 15 minutes. If required, excess small molecule was then added and the resulting solution was incubated for 15 minutes at 37° C. The DNA-ligand conjugate (200 pmol in 2 μL) was added and the resulting solution was incubated for 15 minutes at 37° C. The extension reaction was initiated by the addition of Klenow fragment exo⁻ (1 U in 2 μL 1× NEB Buffer 2). After 15 minutes, the polymerase was inactivated by heating to 75° C. for 20 minutes.

Quantitative PCR (qPCR) Analysis of Primer Extension Reactions

For each 25 μL qPCR reaction, 12.5 μL of 2× SYBR Green iQ Supermix (Bio-Rad) was combined with 1 μL of 10 μM Primer A, 1 μL of 10 μM Primer B, and 9.5 μL Milli Q H₂O. Unless otherwise stated, the primer extension products were diluted 1:100 into H₂O and 1 μL of this solution was added to the mixture described above. Quantitative PCR was performed using a CFX-96 Real-Time System with a C1000 Thermal Cycler (Bio-Rad). PCR conditions: 5 min at 95° C., then 40 cycles of [30 sec at 95° C., 30 sec at 50° C., 30 sec at 72° C.].
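When several qPCR reactions are run in parallel, the per-reaction recipe above can be scaled into a master mix. The sketch below is illustrative only: the helper names and the optional pipetting overage are assumptions and are not part of the described protocol.

```python
# Per-reaction volumes (microliters) taken from the recipe above, excluding template.
PER_REACTION_UL = {
    "2x SYBR Green iQ Supermix": 12.5,
    "Primer A (10 uM)": 1.0,
    "Primer B (10 uM)": 1.0,
    "Milli-Q water": 9.5,
}
TEMPLATE_UL = 1.0  # diluted primer extension product added per reaction


def reaction_volume_ul() -> float:
    """Total volume of one reaction, including template."""
    return sum(PER_REACTION_UL.values()) + TEMPLATE_UL


def master_mix(n_reactions: int, overage: float = 0.0) -> dict:
    """Scale the recipe to n reactions.

    `overage` is a hypothetical fractional excess (e.g. 0.1 for 10%) to cover
    pipetting losses; the source does not specify one.
    """
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


def final_primer_nM(stock_uM: float = 10.0, added_ul: float = 1.0) -> float:
    """Final concentration of each primer in the assembled reaction."""
    return stock_uM * 1000.0 * added_ul / reaction_volume_ul()


if __name__ == "__main__":
    print(master_mix(10))
    print(f"each primer: {final_primer_nM()} nM in {reaction_volume_ul()} uL")
```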

ID-PCR on Single Target-Ligand Pairs (FIGS. 20 and 21)

All primer extension and qPCR reactions were performed as described above, except that free ligand (200 pmol in 2 μL 10% DMSO) or DMSO (2 μL of 10% DMSO) was added when appropriate. To verify that the observed qPCR threshold cycles correlated with formation of PCR product, we performed gel electrophoresis analyses of PCR reactions halted at the cycle threshold value of the matched ligand-target Klenow extension. Unless otherwise noted, PCR products were analyzed by PAGE on 10% TBE gels at 200 V for 25 minutes, followed by staining with ethidium bromide and imaging on a ChemiImager (Alpha Innotech).

The ligand strands, targets, and PCR conditions were as follows. For streptavidin (1a-SA), the following ligand strand sequences were used: positive control, 3a-biotin, 2, 2a-biotin and 2b-desthiobiotin; PCR was performed for 20 cycles (FIG. 20b). For trypsin (1b-trypsin), the following ligand strand sequences were used: positive control, 2, 2a-biotin, 2e-GLCBS, and 2c-antipain; PCR was performed for 20 cycles (FIG. 20c). For carbonic anhydrase (1c-CA), the following ligand strand sequences were used: positive control, 3b-CBS, 2, 2a-biotin, 2d-CBS and 2e-GLCBS; PCR was performed for 24 cycles (FIG. 20d). For the aptamer (1d-aptamer), the following sequences were used: 2f, 2g-Dn, and 2h-Dx; PCR was performed for 23 cycles (FIG. 20e). In this case, the products were stained with SYBRI (Invitrogen) and imaged on a Typhoon Trio Imager (Amersham).

In order to investigate the enrichment of DNA encoding ligand-target pairs in the presence of excess non-binding ligand conjugates, a series of ID-PCR experiments was conducted with mixtures of binding conjugate and non-binding conjugates (see FIG. 21). These mixtures were subjected to Klenow extension either in the presence of the target conjugate (the “selection” case) or a target strand lacking protein (the “negative control” case). Primer extension reactions were performed with constant concentrations of a non-binding ligand-DNA conjugate (10 nM) and varying concentrations of a binding ligand-DNA conjugate (1 nM, 100 pM, 10 pM). The appropriate cycle number for each reaction was determined by qPCR evaluation, such that preparative PCR reactions (50 μL total volume) were stopped in the exponential amplification phase, in order to minimize dynamic compression during PCR. An aliquot (16 μL) of the PCR reaction mixture was then incubated with the appropriate restriction enzyme (FIG. 21b: EcoRI; FIG. 21c: HindIII; FIG. 21d,e: NsiI). All restriction enzymes were purchased from New England Biolabs. The resulting samples were analyzed by PAGE as described above (10% TBE, 200 V, 25 minutes).

FIG. 21b:

Binding     Non-binding   Ratio     Target Strand   PCR Cycles   Gel Lane
2i-biotin   —             1:0       1a-SA           22           1
—           2k-GLCBS      0:1       1a-SA           26           2
2i-biotin   2k-GLCBS      1:10      1               32           3
2i-biotin   2k-GLCBS      1:10      1a-SA           22           4
2i-biotin   2k-GLCBS      1:100     1               32           5
2i-biotin   2k-GLCBS      1:100     1a-SA           25           6
2i-biotin   2k-GLCBS      1:1000    1               32           7
2i-biotin   2k-GLCBS      1:1000    1a-SA           26           8

FIG. 21c:

Binding     Non-binding   Ratio     Target Strand   PCR Cycles   Gel Lane
2k-GLCBS    —             1:0       1c-CA           25           1
—           2i-biotin     0:1       1c-CA           32           2
2k-GLCBS    2i-biotin     1:10      1               30           3
2k-GLCBS    2i-biotin     1:10      1c-CA           25           4
2k-GLCBS    2i-biotin     1:100     1               30           5
2k-GLCBS    2i-biotin     1:100     1c-CA           28           6
2k-GLCBS    2i-biotin     1:1000    1               30           7
2k-GLCBS    2i-biotin     1:1000    1c-CA           31           8

FIG. 21d:

Binding     Non-binding   Ratio     Target Strand     PCR Cycles   Gel Lane
—           2l            0:1       1f-aptamer        28           1
2n-Dn       —             1:0       1f-aptamer        20           2
2n-Dn       2l            1:10      1h-unstructured   28           3
2n-Dn       2l            1:10      1f-aptamer        24           4
2n-Dn       2l            1:100     1h-unstructured   28           5
2n-Dn       2l            1:100     1f-aptamer        28           6
2n-Dn       2l            1:1000    1h-unstructured   28           7
2n-Dn       2l            1:1000    1f-aptamer        28           8

FIG. 21e:

Binding      Non-binding       Ratio     Target Strand   PCR Cycles   Gel Lane
—            1h-unstructured   0:1       2g-Dn           30           1
1f-aptamer   —                 1:0       2g-Dn           21           2
1f-aptamer   1h-unstructured   1:10      2f              30           3
1f-aptamer   1h-unstructured   1:10      2g-Dn           25           4
1f-aptamer   1h-unstructured   1:100     2f              30           5
1f-aptamer   1h-unstructured   1:100     2g-Dn           28           6
1f-aptamer   1h-unstructured   1:1000    2f              22*          7
1f-aptamer   1h-unstructured   1:1000    2g-Dn           22*          8

* indicates samples that were not diluted 100-fold prior to qPCR

ID-PCR of a Ligand Library and Target Library in a Single Solution (FIG. 22)

Primer extension was performed with ligand sequences (40 pM each 2i-biotin, 2q-desthiobiotin, 2s-GLCBS, 2o-CBS, and 2u-antipain and 9.8 nM 2v) and target sequences (60 pM each 1a-SA, 1i-CA and 1j-trypsin and 9.8 nM 1k-GST) such that the reaction contained an equimolar quantity of each of 261 ligand sequences and 259 target sequences. A control primer extension reaction was also performed using the same pool of ligand strands, but with a pool of target strands lacking conjugated proteins (1, 1i, 1j, 60 pM each and 1k, 9.8 nM). Adapter sequences required for Illumina sequencing, as well as barcodes identifying the input and selection experiments, were installed by PCR with either Primer C (input control) or Primer D (selection) and Primer E. The appropriate cycle number for each reaction was determined by qPCR evaluation, such that preparative PCR reactions (50 μL total volume) were stopped in the exponential amplification phase in order to minimize dynamic compression during PCR. PCR product was purified by gel (3% agarose, 200 V, 20 min followed by Qiagen Extraction Kit), quantitated using PicoGreen (Invitrogen), and pooled in equimolar amounts (10 nM total DNA) for sequencing. Sequencing was performed on an Illumina (Solexa) Genome Analyzer II (FAS Center for Systems Biology, Harvard University).
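The library dimensions stated above account for the 67,599 possible ligand-target combinations referred to in Tables 2 and 3. The sketch below checks that product and estimates a per-sequence concentration for the degenerate pools. It assumes, as an inference not stated in the text, that the 2v and 1k-GST pools each contribute the sequences remaining after the individually named conjugates are subtracted (261 − 5 and 259 − 3, respectively).

```python
N_LIGAND_SEQUENCES = 261
N_TARGET_SEQUENCES = 259

# Individually named conjugates in each library (from the paragraph above).
N_NAMED_LIGANDS = 5   # 2i-biotin, 2q-desthiobiotin, 2s-GLCBS, 2o-CBS, 2u-antipain
N_NAMED_TARGETS = 3   # 1a-SA, 1i-CA, 1j-trypsin


def n_combinations(n_ligands: int, n_targets: int) -> int:
    """Number of distinct ligand-target sequence pairings."""
    return n_ligands * n_targets


def per_sequence_pM(pool_nM: float, n_sequences: int) -> float:
    """Concentration of each member of an equimolar pool, in pM."""
    return pool_nM * 1000.0 / n_sequences


# Assumption: the 9.8 nM pools hold all sequences not individually named.
ligand_pool_size = N_LIGAND_SEQUENCES - N_NAMED_LIGANDS
target_pool_size = N_TARGET_SEQUENCES - N_NAMED_TARGETS

if __name__ == "__main__":
    print(n_combinations(N_LIGAND_SEQUENCES, N_TARGET_SEQUENCES))
    print(f"2v pool: about {per_sequence_pM(9.8, ligand_pool_size):.1f} pM per sequence")
    print(f"1k-GST pool: about {per_sequence_pM(9.8, target_pool_size):.1f} pM per sequence")
```

Under this assumption each pooled sequence is present at roughly 38 pM, close to the 40 pM used for the individually added ligand conjugates.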

Approximately 500,000 sequence reads were obtained after sequencing and data processing using MATLAB (the MATLAB script was run on the Odyssey Cluster supported by the FAS Sciences Division Research Computing Group): 286,784 from the selection experiment and 189,133 from the control (no protein) experiment. Due to the large number of possible sequences, many of the 67,599 possible sequences were not observed. A value of 1 was therefore added to the number of observed counts in the input and selection datasets for every sequence. The observed sequence counts were normalized (results from selection were divided by 286,784 and results from control were divided by 189,133). Enrichment factors were determined by dividing the fraction of total counts observed for each sequence after selection by the fraction of total counts observed for that sequence in the input control. The resulting set of enrichment factors is plotted in FIG. 21 and described in Tables 2 and 3.
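The enrichment-factor calculation described above can be sketched in a few lines. The original analysis was performed in MATLAB and its script is not reproduced here; the Python function below is an illustrative reimplementation of the stated steps (pseudocount of 1, normalization by total reads, ratio of fractions), and the sequence labels in the usage example are hypothetical.

```python
SELECTION_READS = 286_784  # total reads from the selection experiment
CONTROL_READS = 189_133    # total reads from the control (no protein) experiment


def enrichment_factors(selection_counts, control_counts, all_sequences,
                       selection_total=SELECTION_READS,
                       control_total=CONTROL_READS,
                       pseudocount=1):
    """Return {sequence: enrichment factor} for every possible sequence.

    A pseudocount is added to every observed count so that sequences absent
    from one dataset do not cause division by zero. Counts are then divided
    by the total reads of their dataset, and the selection fraction is
    divided by the control fraction.
    """
    factors = {}
    for seq in all_sequences:
        sel_fraction = (selection_counts.get(seq, 0) + pseudocount) / selection_total
        ctl_fraction = (control_counts.get(seq, 0) + pseudocount) / control_total
        factors[seq] = sel_fraction / ctl_fraction
    return factors


if __name__ == "__main__":
    # Hypothetical toy counts for two ligand-target pairings.
    selection = {"pair_A": 99}
    control = {"pair_A": 0}
    result = enrichment_factors(selection, control, ["pair_A", "pair_B"],
                                selection_total=1000, control_total=500)
    for seq, factor in result.items():
        print(seq, round(factor, 3))
```

Because the two datasets differ in depth, a sequence unobserved in both does not receive an enrichment factor of exactly 1; with the read totals above it receives 189,133/286,784, or about 0.66.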

FIG. 25 shows ligand-target affinity measurements. FIGS. 25 a and b show ligand-target affinity measurements of CBS, 2d-CBS, GLCBS and 2s-GLCBS complexes with carbonic anhydrase. Enzymatic activity was measured by adopting a previously described method:¹⁶ carbonic anhydrase (20 pmol for CBS and 2d-CBS, 4 pmol for GLCBS and 2s-GLCBS) was diluted in 180 μL assay buffer (PBS, 5% DMSO) and incubated with various concentrations of inhibitor for 10 minutes at room temperature and then for 5 minutes at 37° C. before addition of a chromogenic substrate, 4-nitrophenyl acetate (2 mmol in 20 μL acetonitrile). The change in absorbance signal (400 nm) was measured over 10 minutes at 37° C. using a SpectraMax microplate reader. The data obtained at various inhibitor concentrations were plotted and the K_(i) was determined by fitting the equation below.

FIG. 25 d shows ligand-target affinity measurements of antipain and 2e-antipain complexes with trypsin. Conditions were adopted from a previously reported protocol.¹⁷ Trypsin (200 fmol) was diluted in 90 μL PBS and was incubated with various concentrations of antipain, a known inhibitor of tryptic proteolysis, or 2e-antipain. The solution was equilibrated at room temperature for 10 minutes and at 35° C. for 5 minutes before addition of the fluorogenic proteolytic substrate Z-Arg-AMC (Bachem) (10 nmol in 10 μL 50% DMSO) (final volume=100 μL). The change in fluorescence signal (ex: 383 nm; em: 455 nm; cutoff filter: 420 nm) was measured over 10 minutes at 35° C. using a SpectraMax microplate reader, and fitting of results obtained at various concentrations of inhibitor enabled determination of K_(i) (see equation below).

Equation Used to Fit K_(i) from Inhibitory Curves (FIGS. 25 a, b, d).

The K_(i) was determined by fitting the following equation to the plot of normalized activity vs. inhibitor concentration:

$$V_o = c + d\,\frac{\left([E]_{tot} + [I]_{tot} + K_i\right) - \sqrt{\left(-[E]_{tot} - [I]_{tot} - K_i\right)^2 - 4\,[E]_{tot}\,[I]_{tot}}}{2\,[E]_{tot}}$$

where V_(o) is initial reaction velocity, [E]_(tot) is total enzyme concentration, and [I]_(tot) is total inhibitor concentration; c and d are variable parameters for the maximum V_(o) and for the difference between the minimum and maximum V_(o), respectively.
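The fitting equation can be written as a small model function. The sketch below only evaluates the model and does not perform the fit; the function name and argument names are illustrative, all concentrations must share one unit, and d is taken here as negative (minimum minus maximum V_(o)) so that activity falls as inhibitor is added, which is an interpretation of the parameter definitions above.

```python
import math


def initial_velocity(i_tot: float, e_tot: float, k_i: float,
                     c: float, d: float) -> float:
    """Evaluate V_o = c + d * (fraction of enzyme bound by inhibitor).

    The bound fraction is the physically meaningful root of the quadratic
    binding equation; (-E - I - K)^2 equals (E + I + K)^2, so the sum is
    computed once and squared.
    """
    s = e_tot + i_tot + k_i
    bound_fraction = (s - math.sqrt(s * s - 4.0 * e_tot * i_tot)) / (2.0 * e_tot)
    return c + d * bound_fraction


if __name__ == "__main__":
    # Hypothetical parameters: normalized activity running from 1 (c) toward 0 (c + d).
    for inhibitor in (0.0, 0.1, 1.0, 10.0, 100.0):
        v = initial_velocity(inhibitor, e_tot=1.0, k_i=1.0, c=1.0, d=-1.0)
        print(f"[I]tot = {inhibitor:>6}: V_o = {v:.4f}")
```

A function of this form could be passed to a nonlinear least-squares routine with K_(i), c, and d as free parameters and [E]_(tot) held at the known enzyme concentration.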

FIG. 25 c shows measurement of 2b-desthiobiotin/streptavidin complex by gel shift. The DNA-ligand conjugate 2b-desthiobiotin was hybridized to a complementary 20-mer carrying a Cy3 label at the 3′-end. The resulting duplex (10 nM) was incubated with increasing concentrations of streptavidin in binding buffer (145 mM NaCl, 10 mM Tris/HCl, 10 mM MgCl₂, 7 mM Na₂PO₄, 1.9 mM KCl, 1 mM dithiothreitol, pH 7.4) for 1 h and analyzed on a 20% TBE polyacrylamide gel. The appearance of a higher molecular weight band indicated formation of a streptavidin-2b-desthiobiotin complex. The apparent K_(d) was determined from the streptavidin concentration for which the bands corresponding to free DNA and to the complex were of approximately equal intensity.

REFERENCES

-   (1) a) Inglese, J.; Johnson, R. L.; Simeonov, A.; Xia, M.; Zheng, W.; Austin, C. P.; Auld, D. S. Nat. Chem. Biol. 2007, 3, 466. b) Zhu, Z.; Cuozzo, J. J. Biomol. Screening 2009, 14, 1157.
-   (2) Vijayendran, R. A.; Leckband, D. E. Anal. Chem. 2001, 73, 471.
-   (3) Gorin, D. J.; Kamlet, A. S.; Liu, D. R. J. Am. Chem. Soc. 2009, 131, 9189.
-   (4) Green, N. M. Methods in Enzymol. 1990, 184, 62.
-   (5) Dumelin, C. E.; Scheuermann, J.; Melkko, S.; Neri, D. Bioconjugate Chem. 2006, 17, 366.
-   (6) Torreggiani, A.; Fini, G. Biospectroscopy 1998, 4, 197.
-   (7) Otto, H. H.; Schirmeister, T. Chem. Rev. 1997, 97, 133.
-   (8) West, G. M.; Tang, L.; Fitzgerald, M. C. Anal. Chem. 2008, 80, 4175.
-   (9) Mincione, F.; Starnotti, M.; Menabuoni, L.; Scozzafava, A.; Casini, A.; Supuran, C. T. Bioorg. Med. Chem. Lett. 2001, 11, 1787.
-   (10) a) Ellington, A. D.; Szostak, J. W. Nature 1990, 346, 818. b) Robertson, D. L.; Joyce, G. F. Nature 1990, 344, 467. c) Tuerk, C.; Gold, L. Science 1990, 249, 505. For a general review see: d) Wilson, D. S.; Szostak, J. W. Annu. Rev. in Biochem. 1999, 68, 611.
-   (11) Wochner, A.; Menger, M.; Orgel, D.; Cech, B.; Rimmele, M.; Erdmann, V. A.; Gloekler, J. Anal. Biochem. 2008, 373, 34.
-   (12) Melkko, S.; Scheuermann, J.; Dumelin, C. E.; Neri, D. Nat. Biotechnol. 2004, 22, 568.
-   (13) Sprinz, K. I.; Tagore, D. M.; Hamilton, A. D. Bioorg. Med. Chem. Lett. 2005, 15, 3908.
-   (14) a) Bowley, D. R.; Jones, T. M.; Burton, D. R.; Lerner, R. A. P. Natl. Acad. Sci. USA 2009, 106, 1380. b) Fredriksson, S.; Gullberg, M.; Jarvius, J.; Olsson, C.; Pietras, K.; Gustafsdottir, S. M.; Ostman, A.; Landegren, U. Nat. Biotechnol. 2002, 20, 473. c) Hofstadler, S. A.; Griffey, R. H. Chem. Rev. 2001, 101, 377.
-   (15) a) Clark, M. A. et al. Nat. Chem. Biol. 2009, 1. b) Mannocci, L.; Zhang, Y.; Scheuermann, J.; Leimbacher, M.; De Bellis, G.; Rizzi, E.; Dumelin, C. E.; Melkko, S.; Neri, D. P. Natl. Acad. Sci. USA 2008, 105, 17670. c) Gorska, K.; Huang, K.-T.; Chaloin, O.; Winssinger, N. Angew. Chem. Int. Ed. 2009, 48, 7695. d) Halpin, D. R.; Harbury, P. B. PLoS Biol. 2004, 2, 1022. e) Hansen, M. H.; Blakskjaer, P.; Petersen, L. K.; Hansen, T. H.; Hojfeldt, J. W.; Gothelf, K. V.; Hansen, N. J. V. J. Am. Chem. Soc. 2009, 131, 1322. f) Kanan, M. W.; Rozenman, M. M.; Sakurai, K.; Snyder, T. M.; Liu, D. R. Nature 2004, 431, 545. g) Kleiner, R. E.; Dumelin, C. E.; Tiu, G. C.; Sakurai, K.; Liu, D. R. J. Am. Chem. Soc. 2010, 1. h) Tse, B. N.; Snyder, T. M.; Shen, Y.; Liu, D. R. J. Am. Chem. Soc. 2008, 130, 15611. i) Doyon, J. B.; Snyder, T. M.; Liu, D. R. J. Am. Chem. Soc. 2003, 125, 12372. j) Gartner, Z. J.; Liu, D. R. J. Am. Chem. Soc. 2001, 123, 6961. k) Gartner, Z. J.; Tse, B. N.; Grubina, R.; Doyon, J. B.; Snyder, T. M.; Liu, D. R. Science 2004, 305, 1601. l) Clark, M. A.; Acharya, R. A.; Arico-Muendel, C. C.; Belyanskaya, S. L.; Benjamin, D. R.; Carlson, N. R.; Centrella, P. A.; Chiu, C. H.; Creaser, S. P.; Cuozzo, J. W.; Davie, C. P.; Ding, Y.; Franklin, G. J.; Franzen, K. D.; Gefter, M. L.; Hale, S. P.; Hansen, N. J. V.; Israel, D. I.; Jiang, J.; Kavarana, M. J.; Kelley, M. S.; Kollmann, C. S.; Li, F.; Lind, K.; Mataruse, S.; Medeiros, P. F.; Messer, J. A.; Myers, P.; O'Keefe, H.; Oliff, M. C.; Rise, C. E.; Satz, A. L.; Skinner, S. R.; Svendsen, J. L.; Tang, L.; Van Vloten, K.; Wagner, R. W.; Yao, G.; Zhao, B.; Morgan, B. A. Nat. Chem. Biol. 2009, 5, 647.
-   (16) Pocker, Y.; Stone, J. T. Biochemistry 1967, 6, 668.
-   (17) Melkko, S.; Zhang, Y.; Dumelin, C. E.; Scheuermann, J.; Neri, D. Angew. Chem. Int. Ed. 2007, 46, 4671.
-   (18) Jain, A.; Whitesides, G. M.; Alexander, R. S.; Christianson, D. W. J. Med. Chem. 1994, 37, 2100.
-   (19) Mincione, F.; Starnotti, M.; Menabuoni, L.; Scozzafava, A.; Casini, A.; Supuran, C. T. Bioorg. Med. Chem. Lett. 2001, 11, 1787.

All publications, patents and sequence database entries mentioned herein, including those items listed below, are hereby incorporated by reference in their entirety as if each individual publication or patent was specifically and individually indicated to be incorporated by reference. In case of conflict, the present application, including any definitions herein, will control.

EQUIVALENTS AND SCOPE

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. The scope of the present invention is not intended to be limited to the above description, but rather is as set forth in the appended claims.

In the claims, articles such as “a,” “an,” and “the” may mean one or more than one unless indicated to the contrary or otherwise evident from the context. Claims or descriptions that include “or” between one or more members of a group are considered satisfied if one, more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process unless indicated to the contrary or otherwise evident from the context. The invention includes embodiments in which exactly one member of the group is present in, employed in, or otherwise relevant to a given product or process. The invention also includes embodiments in which more than one, or all of the group members are present in, employed in, or otherwise relevant to a given product or process. Furthermore, it is to be understood that the invention encompasses all variations, combinations, and permutations in which one or more limitations, elements, clauses, descriptive terms, etc., from one or more of the claims or from relevant portions of the description is introduced into another claim. For example, any claim that is dependent on another claim can be modified to include one or more limitations found in any other claim that is dependent on the same base claim. Furthermore, where the claims recite a composition, it is to be understood that methods of using the composition for any of the purposes disclosed herein are included, and methods of making the composition according to any of the methods of making disclosed herein or other methods known in the art are included, unless otherwise indicated or unless it would be evident to one of ordinary skill in the art that a contradiction or inconsistency would arise.

Where elements are presented as lists, e.g., in Markush group format, it is to be understood that each subgroup of the elements is also disclosed, and any element(s) can be removed from the group. It is also noted that the term “comprising” is intended to be open and permits the inclusion of additional elements or steps. It should be understood that, in general, where the invention, or aspects of the invention, is/are referred to as comprising particular elements, features, steps, etc., certain embodiments of the invention or aspects of the invention consist, or consist essentially of, such elements, features, steps, etc. For purposes of simplicity those embodiments have not been specifically set forth in haec verba herein. Thus for each embodiment of the invention that comprises one or more elements, features, steps, etc., the invention also provides embodiments that consist or consist essentially of those elements, features, steps, etc.

Where ranges are given, endpoints are included. Furthermore, it is to be understood that unless otherwise indicated or otherwise evident from the context and/or the understanding of one of ordinary skill in the art, values that are expressed as ranges can assume any specific value within the stated ranges in different embodiments of the invention, to the tenth of the unit of the lower limit of the range, unless the context clearly dictates otherwise. It is also to be understood that unless otherwise indicated or otherwise evident from the context and/or the understanding of one of ordinary skill in the art, values expressed as ranges can assume any subrange within the given range, wherein the endpoints of the subrange are expressed to the same degree of accuracy as the tenth of the unit of the lower limit of the range.

In addition, it is to be understood that any particular embodiment of the present invention may be explicitly excluded from any one or more of the claims. Any embodiment, element, feature, application, or aspect of the compositions and/or methods of the invention can be excluded from any one or more claims. For purposes of brevity, all of the embodiments in which one or more elements, features, purposes, or aspects is excluded are not set forth explicitly herein.

What is claimed is:
 1. A method for reactivity-dependent polymerase chain reaction, comprising: (i) providing a nucleic acid template, wherein the nucleic acid template comprises: a first primer hybridization site, optionally, a sequence tag, a second primer hybridization site, and a candidate reactive moiety; (ii) contacting the nucleic acid template with a first primer, wherein the first primer comprises: a sequence complementary to the first primer hybridization site, a third primer hybridization site, and a target reactive moiety; (iii) incubating the nucleic acid template contacted with the first primer under conditions suitable for the candidate reactive moiety to form a covalent bond with the target reactive moiety; (iv) incubating the nucleic acid template contacted with the first primer under conditions suitable for covalently bound first primer to hybridize with the first primer hybridization site of the nucleic acid template it is covalently bound to and for primer extension; (v) contacting the nucleic acid template contacted with the first primer with a PCR primer complementary to the second primer hybridization site and a PCR primer complementary to the third primer hybridization site or with a PCR primer complementary to the second and third primer hybridization site; and (vi) performing a polymerase chain reaction to amplify the nucleic acid template or a fragment thereof.
 2. A method for reactivity-dependent polymerase chain reaction, comprising: (i) providing a first nucleic acid molecule, wherein the first nucleic acid molecule comprises a first primer hybridization site, a second primer hybridization site, optionally, a sequence tag, and a candidate reactive moiety; (ii) contacting the first nucleic acid molecule with a second nucleic acid molecule, wherein the second nucleic acid molecule comprises a nucleic acid sequence complementary to the first primer hybridization site of the first nucleic acid molecule, and a third primer hybridization site, and a target reactive moiety; (iii) incubating the first nucleic acid molecule contacted with the second nucleic acid molecule under conditions suitable for the candidate reactive moiety to form a covalent bond with the target reactive moiety; (iv) incubating the first nucleic acid molecule covalently bound to the second nucleic acid molecule under conditions suitable for hybridization of the nucleic acid sequence complementary to the first primer hybridization site to the first primer hybridization site and for primer extension; (v) contacting the first nucleic acid molecule covalently bound to the second nucleic acid molecule with a primer complementary to the second primer hybridization site and a primer complementary to the third primer hybridization site; and (vi) performing a polymerase chain reaction (PCR) to amplify the nucleic acid template or a fragment thereof.
 3. A method for reactivity-dependent polymerase chain reaction, comprising: (i) providing a first nucleic acid molecule, wherein the first nucleic acid molecule comprises a first primer hybridization site, optionally, a sequence tag adjacent to the first primer hybridization site, optionally, a restriction site adjacent to the sequence tag, and a candidate reactive moiety; (ii) contacting the first nucleic acid molecule with a second nucleic acid molecule, wherein the second nucleic acid molecule comprises a nucleic acid sequence complementary to the first primer hybridization site of the first nucleic acid molecule, a third primer hybridization site, and a target reactive moiety; (iii) incubating the first nucleic acid molecule contacted with the second nucleic acid molecule under conditions suitable for the candidate reactive moiety to form a covalent bond with the target reactive moiety; (iv) incubating the first nucleic acid molecule covalently bound to the second nucleic acid molecule under conditions suitable for hybridization of the nucleic acid sequence complementary to the first primer hybridization site to the first primer hybridization site and for primer extension; (v) optionally, contacting the first nucleic acid molecule covalently bound to the second nucleic acid molecule with a restriction enzyme under conditions suitable for restriction enzyme digest; (vi) contacting the first nucleic acid molecule covalently bound to the second nucleic acid molecule with a nucleic acid linker, wherein the linker comprises a second primer hybridization site; (vii) incubating the first nucleic acid contacted with the linker under conditions suitable to ligate the linker to the first nucleic acid molecule; (viii) contacting the first nucleic acid molecule contacted with the second nucleic acid molecule with a PCR primer complementary to the second primer hybridization site and a PCR primer complementary to the third primer hybridization site; and (ix) performing a polymerase chain reaction (PCR) to amplify the nucleic acid template or a portion thereof.
 4. The method of claim 1, wherein the first primer hybridization site is between 5 and 16 nucleotides long.
 5. The method of claim 1, wherein the second and the third primer hybridization site comprise the same nucleic acid sequence.
 6. The method of claim 1, wherein the primer complementary to the second primer hybridization site and the primer complementary to the third primer hybridization site comprise the same nucleic acid sequence.
 7. The method of claim 1, wherein the second and the third primer hybridization site comprise different nucleic acid sequences.
 8. The method of claim 1, wherein the first primer hybridization site and the third primer hybridization site overlap or are identical.
 9. The method of claim 1, wherein the sequence tag sequence is 5-30 nucleotides long.
 10. The method of claim 1, wherein the candidate and/or the target reactive moiety is a functional group selected from the group consisting of alkenyl, alkynyl, phenyl, benzyl, halo, haloformyl, hydroxyl, carbonyl, aldehyde, carbonate ester, carboxylate, carboxyl, ether, ester, carboxyamide, amine, ketimine, aldimine, imide, azido, diimide, cyanate, isocyanide, isocyanate, isothiocyanate, nitrile, sulfide, and disulfide groups.
 11. The method of claim 1, wherein the covalent bond formed between the candidate and the target reactive moiety is an amide bond, an acyl bond, a disulfide bond, an alkyl bond, an ether bond, or an ester bond.
 12. The method of claim 1, wherein the covalent bond formed between the candidate and the target reactive moiety is a carbon-carbon bond, a carbon-oxygen bond, a carbon-nitrogen bond, a carbon-sulfur bond, a sulfur-sulfur bond, a carbon-phosphorus bond, a phosphorus-oxygen bond, or a phosphorus-nitrogen bond.
 13. The method of claim 1, wherein the covalent bond between the candidate and the target reactive moiety is formed by an acylation, an addition, a nucleophilic substitution, a Huisgen cycloaddition, a carbonyl chemistry reaction, a “non aldol”-type carbonyl chemistry reaction, or an addition to carbon-carbon double or triple bonds.
 14. The method of claim 1, further comprising contacting the first nucleic acid molecule contacted with the second nucleic acid molecule with a catalyst or reagent catalyzing a covalent bond-forming reaction between the candidate reactive moiety and the target reactive moiety under suitable conditions.
 15. The method of claim 1, further comprising contacting the nucleic acid template contacted with the first primer with a catalyst or reagent catalyzing a covalent bond-forming reaction between the candidate reactive moiety and the target reactive moiety under suitable conditions.
 16. The method of claim 1, wherein the conditions suitable for hybridization of the nucleic acid sequence complementary to the first primer hybridization site to the first primer hybridization site and primer extension are conditions not allowing for efficient primer site hybridization and primer extension of the first and second nucleic acid molecules not connected by a covalent bond.
 17. The method of claim 1, wherein the conditions suitable for hybridization of the nucleic acid sequence complementary to the first primer hybridization site to the first primer hybridization site and primer extension are conditions not allowing for efficient primer site hybridization and primer extension of first primer not covalently bound to the nucleic acid template.
 18. The method of claim 1, wherein the PCR is quantitative, real-time PCR.
 19. A method for reactivity-dependent polymerase chain reaction screening of a reactive moiety library, comprising: (i) providing a library of nucleic acid templates, wherein each nucleic acid template comprises: a first primer hybridization site, a sequence tag, and a second primer hybridization site, and a candidate reactive moiety, and wherein the candidate reactive moiety of any specific nucleic acid template can be identified by its sequence tag sequence; (ii) contacting the nucleic acid templates of said library with a first primer comprising: a sequence complementary to the first primer hybridization site, a third primer hybridization site, and a target reactive moiety; (iii) incubating the nucleic acid templates contacted with the first primer under conditions suitable for the candidate reactive moiety to form a covalent bond with the target reactive moiety; (iv) incubating the nucleic acid templates of the library contacted with the first primer under conditions suitable for covalently bound first primer to hybridize with the first primer hybridization site of the nucleic acid template it is covalently bound to and for primer extension; (v) contacting the nucleic acid templates contacted with the first primer with a PCR primer complementary to the second primer hybridization site and a PCR primer complementary to the third primer hybridization site, or a PCR primer complementary to the second and the third primer hybridization site; and (vi) performing a polymerase chain reaction amplifying a nucleic acid template sequence tag identifying a candidate reactive moiety able to form a covalent bond to the second reactive moiety.
 20. A method for reactivity-dependent polymerase chain reaction, comprising: (i) providing a nucleic acid template, wherein the nucleic acid template comprises a first primer hybridization site, optionally, a sequence tag, a second primer hybridization site, and a reactive moiety; (ii) contacting the nucleic acid template with a first primer, wherein the first primer comprises: a sequence complementary to the first primer hybridization site, a third primer hybridization site, and a reactive moiety; (iii) incubating the nucleic acid template contacted with the first primer under conditions suitable for the candidate reactive moiety to form a covalent bond with the target reactive moiety either only in the presence or only in the absence of an environmental parameter or analyte; (iv) incubating the nucleic acid template contacted with the first primer under conditions suitable for covalently bound first primer to hybridize with the first primer hybridization site of the nucleic acid template it is covalently bound to and for primer extension; (v) contacting the nucleic acid template contacted with the first primer with a PCR primer complementary to the second primer hybridization site and a PCR primer complementary to the third primer hybridization site or with a PCR primer complementary to the second and third primer hybridization site; and (vi) performing a polymerase chain reaction to amplify the nucleic acid template, or a fragment thereof.